* raid10, far layout initial sync slow + XFS question
@ 2023-09-01 20:23 CoolCold
2023-09-01 20:37 ` Roman Mamedov
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: CoolCold @ 2023-09-01 20:23 UTC (permalink / raw)
To: Linux RAID
Good day!
I have 4 NVMe new drives which are planned to replace 2 current NVMe
drives, serving primarily as MYSQL storage, Hetzner dedicated server
AX161 if it matters. Drives are SAMSUNG MZQL23T8HCLS-00A07, 3.8TB .
System - Ubuntu 20.04 / 5.4.0-153-generic #170-Ubuntu
The strange thing I observe is the initial RAID sync speed.
Created with:
mdadm --create /dev/md3 --run -b none --level=10 --layout=f2
--chunk=16 --raid-devices=4 /dev/nvme0n1 /dev/nvme4n1 /dev/nvme3n1
/dev/nvme5n1
sync speed:
md3 : active raid10 nvme5n1[3] nvme3n1[2] nvme4n1[1] nvme0n1[0]
7501212288 blocks super 1.2 16K chunks 2 far-copies [4/4] [UUUU]
[=>...................] resync = 6.2% (466905632/7501212288)
finish=207.7min speed=564418K/sec
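As a sanity check (my arithmetic, not part of the original mail), md's ETA can be recomputed from the numbers in the mdstat output above:

```shell
# (total - synced) 1K-blocks divided by speed in K/sec gives seconds left;
# the figures are copied from the /proc/mdstat snippet above.
total=7501212288
synced=466905632
speed=564418
remaining=$(( total - synced ))
minutes=$(( remaining / speed / 60 ))
echo "about ${minutes} min left"   # agrees with the reported finish=207.7min
```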
If I try to create a RAID1 with just two of the drives, the sync speed is
around 3.2 GByte per second; sysctl is tuned, of course:
dev.raid.speed_limit_max = 8000000
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5]
[raid4] [raid10]
md70 : active raid1 nvme4n1[1] nvme5n1[0]
3750606144 blocks super 1.2 [2/2] [UU]
[>....................] resync = 1.5% (58270272/3750606144)
finish=19.0min speed=3237244K/sec
From iostat, the drives are basically doing just READs, no writes.
A quick test with fio on a single mounted drive shows it can do around 30k
IOPS at 16kb ( fio --rw=write --ioengine=sync --fdatasync=1
--directory=test-data --size=8200m --bs=16k --name=mytest ), so the issue
is likely not the drives themselves.
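For rough context (my back-of-the-envelope, not from the mail itself), 30k sync-write IOPS at 16 KiB each works out to the following per-drive throughput:

```shell
# Convert the fio result above into bandwidth: IOPS times block size.
iops=30000
bs_kib=16
mib_s=$(( iops * bs_kib / 1024 ))
echo "~${mib_s} MiB/s per drive"
```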
Not sure where to look further, please advise.
--
Best regards,
[COOLCOLD-RIPN]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: raid10, far layout initial sync slow + XFS question
2023-09-01 20:23 raid10, far layout initial sync slow + XFS question CoolCold
@ 2023-09-01 20:37 ` Roman Mamedov
2023-09-01 20:43 ` CoolCold
2023-09-01 20:58 ` CoolCold
2023-09-02 3:56 ` Yu Kuai
2 siblings, 1 reply; 11+ messages in thread
From: Roman Mamedov @ 2023-09-01 20:37 UTC (permalink / raw)
To: CoolCold; +Cc: Linux RAID
Hello,
On Sat, 2 Sep 2023 03:23:00 +0700
CoolCold <coolthecold@gmail.com> wrote:
> So the strange thing I do observe, is its initial raid sync speed.
> Created with:
> mdadm --create /dev/md3 --run -b none --level=10 --layout=f2
> --chunk=16 --raid-devices=4 /dev/nvme0n1 /dev/nvme4n1 /dev/nvme3n1
> /dev/nvme5n1
>
> sync speed:
>
> md3 : active raid10 nvme5n1[3] nvme3n1[2] nvme4n1[1] nvme0n1[0]
> 7501212288 blocks super 1.2 16K chunks 2 far-copies [4/4] [UUUU]
> [=>...................] resync = 6.2% (466905632/7501212288)
> finish=207.7min speed=564418K/sec
Any difference if you use e.g. --chunk=1024?
How about a newer kernel (such as 6.1)?
--
With respect,
Roman
* Re: raid10, far layout initial sync slow + XFS question
2023-09-01 20:37 ` Roman Mamedov
@ 2023-09-01 20:43 ` CoolCold
2023-09-01 21:00 ` Roman Mamedov
0 siblings, 1 reply; 11+ messages in thread
From: CoolCold @ 2023-09-01 20:43 UTC (permalink / raw)
To: Roman Mamedov; +Cc: Linux RAID
On Sat, Sep 2, 2023 at 3:37 AM Roman Mamedov <rm@romanrm.net> wrote:
>
> Hello,
>
> On Sat, 2 Sep 2023 03:23:00 +0700
> CoolCold <coolthecold@gmail.com> wrote:
>
> > So the strange thing I do observe, is its initial raid sync speed.
> > Created with:
> > mdadm --create /dev/md3 --run -b none --level=10 --layout=f2
> > --chunk=16 --raid-devices=4 /dev/nvme0n1 /dev/nvme4n1 /dev/nvme3n1
> > /dev/nvme5n1
> >
> > sync speed:
> >
> > md3 : active raid10 nvme5n1[3] nvme3n1[2] nvme4n1[1] nvme0n1[0]
> > 7501212288 blocks super 1.2 16K chunks 2 far-copies [4/4] [UUUU]
> > [=>...................] resync = 6.2% (466905632/7501212288)
> > finish=207.7min speed=564418K/sec
>
> Any difference if you use e.g. --chunk=1024?
Goes up to 1.4GB/sec:
md3 : active raid10 nvme5n1[3] nvme3n1[2] nvme4n1[1] nvme0n1[0]
7501209600 blocks super 1.2 1024K chunks 2 far-copies [4/4] [UUUU]
[>....................] resync = 0.4% (35959488/7501209600)
finish=86.4min speed=1438382K/sec
>
> How about a newer kernel (such as 6.1)?
Not applicable in my case - unluckily there is no test machine to play
around with non-LTS kernels and reboots. Upgrading to the next HWE kernel
may happen though, which is 5.15.0-82-generic #91-Ubuntu.
Do you know of any specific patches/fixes that landed since 5.4?
>
> --
> With respect,
> Roman
--
Best regards,
[COOLCOLD-RIPN]
* Re: raid10, far layout initial sync slow + XFS question
2023-09-01 20:23 raid10, far layout initial sync slow + XFS question CoolCold
2023-09-01 20:37 ` Roman Mamedov
@ 2023-09-01 20:58 ` CoolCold
2023-09-02 3:56 ` Yu Kuai
2 siblings, 0 replies; 11+ messages in thread
From: CoolCold @ 2023-09-01 20:58 UTC (permalink / raw)
To: Linux RAID
For statistics, everything is the same except the layout:
offset: around 700-780MB/sec
created: mdadm --create /dev/md3 --run -b none --level=10 --layout=o2
--chunk=16 --raid-devices=4 /dev/nvme0n1 /dev/nvme4n1 /dev/nvme3n1
/dev/nvme5n1
md3 : active raid10 nvme5n1[3] nvme3n1[2] nvme4n1[1] nvme0n1[0]
7501212288 blocks super 1.2 16K chunks 2 offset-copies [4/4] [UUUU]
[>....................] resync = 1.5% (119689152/7501212288)
finish=156.3min speed=786749K/sec
near: around 700MB/sec
created: mdadm --create /dev/md3 --run -b none --level=10 --layout=n2
--chunk=16 --raid-devices=4 /dev/nvme0n1 /dev/nvme4n1 /dev/nvme3n1
/dev/nvme5n1
md3 : active raid10 nvme5n1[3] nvme3n1[2] nvme4n1[1] nvme0n1[0]
7501212320 blocks super 1.2 16K chunks 2 near-copies [4/4] [UUUU]
[>....................] resync = 0.5% (42373104/7501212320)
finish=175.7min speed=707262K/sec
On Sat, Sep 2, 2023 at 3:23 AM CoolCold <coolthecold@gmail.com> wrote:
>
> Good day!
>
> I have 4 NVMe new drives which are planned to replace 2 current NVMe
> drives, serving primarily as MYSQL storage, Hetzner dedicated server
> AX161 if it matters. Drives are SAMSUNG MZQL23T8HCLS-00A07, 3.8TB .
> System - Ubuntu 20.04 / 5.4.0-153-generic #170-Ubuntu
>
> So the strange thing I do observe, is its initial raid sync speed.
> Created with:
> mdadm --create /dev/md3 --run -b none --level=10 --layout=f2
> --chunk=16 --raid-devices=4 /dev/nvme0n1 /dev/nvme4n1 /dev/nvme3n1
> /dev/nvme5n1
>
> sync speed:
>
> md3 : active raid10 nvme5n1[3] nvme3n1[2] nvme4n1[1] nvme0n1[0]
> 7501212288 blocks super 1.2 16K chunks 2 far-copies [4/4] [UUUU]
> [=>...................] resync = 6.2% (466905632/7501212288)
> finish=207.7min speed=564418K/sec
>
> If I try to create RAID1 with just two drives - sync speed is around
> 3.2GByte per second, sysclt is tuned of course:
> dev.raid.speed_limit_max = 8000000
>
> Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5]
> [raid4] [raid10]
> md70 : active raid1 nvme4n1[1] nvme5n1[0]
> 3750606144 blocks super 1.2 [2/2] [UU]
> [>....................] resync = 1.5% (58270272/3750606144)
> finish=19.0min speed=3237244K/sec
>
> From iostat, drives are basically doing just READs, no writes.
> Quick tests with fio, mounting single drive shows it can do around 30k
> IOPS with 16kb ( fio --rw=write --ioengine=sync --fdatasync=1
> --directory=test-data --size=8200m --bs=16k --name=mytest ) so likely
> issue are not drives themselves.
>
> Not sure where to look further, please advise.
>
> --
> Best regards,
> [COOLCOLD-RIPN]
--
Best regards,
[COOLCOLD-RIPN]
* Re: raid10, far layout initial sync slow + XFS question
2023-09-01 20:43 ` CoolCold
@ 2023-09-01 21:00 ` Roman Mamedov
2023-09-01 21:17 ` CoolCold
0 siblings, 1 reply; 11+ messages in thread
From: Roman Mamedov @ 2023-09-01 21:00 UTC (permalink / raw)
To: CoolCold; +Cc: Linux RAID
On Sat, 2 Sep 2023 03:43:42 +0700
CoolCold <coolthecold@gmail.com> wrote:
> > > md3 : active raid10 nvme5n1[3] nvme3n1[2] nvme4n1[1] nvme0n1[0]
> > > 7501212288 blocks super 1.2 16K chunks 2 far-copies [4/4] [UUUU]
> > > [=>...................] resync = 6.2% (466905632/7501212288)
> > > finish=207.7min speed=564418K/sec
> >
> > Any difference if you use e.g. --chunk=1024?
> Goes up to 1.4GB
>
> md3 : active raid10 nvme5n1[3] nvme3n1[2] nvme4n1[1] nvme0n1[0]
> 7501209600 blocks super 1.2 1024K chunks 2 far-copies [4/4] [UUUU]
> [>....................] resync = 0.4% (35959488/7501209600)
> finish=86.4min speed=1438382K/sec
Looks like you have found at least some bottleneck. Does it ever reach the
RAID1 performance at some point if you raise it further to 4096, 8192 or more?
It might also be worth it to try making the RAID with --assume-clean, and then
look at the actual array performance, not just the sync speed.
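Roman's suggestion, sketched as commands (device names are copied from the thread; the follow-up fio invocation is my illustrative assumption, not something he specified):

```shell
# Skip the initial resync entirely and benchmark the array itself.
# CAUTION: with --assume-clean the two copies stay inconsistent until
# every block has been written at least once.
mdadm --create /dev/md3 --run --assume-clean -b none --level=10 \
      --layout=f2 --chunk=16 --raid-devices=4 \
      /dev/nvme0n1 /dev/nvme4n1 /dev/nvme3n1 /dev/nvme5n1

# Then measure the md device directly (hypothetical fio job; destroys data):
fio --name=md3test --filename=/dev/md3 --rw=write --bs=16k --direct=1 \
    --ioengine=libaio --iodepth=32 --runtime=30 --time_based
```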
> > How about a newer kernel (such as 6.1)?
> Not applicable in my case- there is no test machine unluckily to play
> around with non LTS and reboots. Upgrading to next HWE kernel may
> happen though, which is 5.15.0-82-generic #91-Ubuntu.
> Do you know any specific patches/fixes landed since 5.4?
No idea. I guessed if you are just setting up a new server, it would be
possible to slip in a reboot or a few. :)
--
With respect,
Roman
* Re: raid10, far layout initial sync slow + XFS question
2023-09-01 21:00 ` Roman Mamedov
@ 2023-09-01 21:17 ` CoolCold
2023-09-01 21:26 ` Roman Mamedov
0 siblings, 1 reply; 11+ messages in thread
From: CoolCold @ 2023-09-01 21:17 UTC (permalink / raw)
To: Roman Mamedov; +Cc: Linux RAID
On Sat, Sep 2, 2023 at 4:00 AM Roman Mamedov <rm@romanrm.net> wrote:
>
> On Sat, 2 Sep 2023 03:43:42 +0700
> CoolCold <coolthecold@gmail.com> wrote:
>
> > > > md3 : active raid10 nvme5n1[3] nvme3n1[2] nvme4n1[1] nvme0n1[0]
> > > > 7501212288 blocks super 1.2 16K chunks 2 far-copies [4/4] [UUUU]
> > > > [=>...................] resync = 6.2% (466905632/7501212288)
> > > > finish=207.7min speed=564418K/sec
> > >
> > > Any difference if you use e.g. --chunk=1024?
> > Goes up to 1.4GB
> >
> > md3 : active raid10 nvme5n1[3] nvme3n1[2] nvme4n1[1] nvme0n1[0]
> > 7501209600 blocks super 1.2 1024K chunks 2 far-copies [4/4] [UUUU]
> > [>....................] resync = 0.4% (35959488/7501209600)
> > finish=86.4min speed=1438382K/sec
>
> Looks like you have found at least some bottleneck. Does it ever reach the
Definitely there is a bottleneck, and I very much doubt I'm the first one
facing it - NVMe drives doing > 1GB/sec are quite widespread.
> RAID1 performance at some point if you raise it further to 4096, 8192 or more?
I can try, for the sake of testing, but in terms of practical outcome -
let's imagine it reaches the maximum with 8MB chunks; what do I do with
that knowledge?
>
> It might also be worth it to try making the RAID with --assume-clean, and then
> look at the actual array performance, not just the sync speed.
>
> > > How about a newer kernel (such as 6.1)?
> > Not applicable in my case- there is no test machine unluckily to play
> > around with non LTS and reboots. Upgrading to next HWE kernel may
> > happen though, which is 5.15.0-82-generic #91-Ubuntu.
> > Do you know any specific patches/fixes landed since 5.4?
>
> No idea. I guessed if you are just setting up a new server, it would be
> possible to slip in a reboot or a few. :)
Unfortunately no - I'm trying to speed up an existing DB, which is the
master node in this setup.
Full disclosure: I've actually done a "quick" test with those drives
handling the load - created a RAID10/f2, added it to the VG, did a pvmove
of the MySQL-specific LV and observed. In terms of weighted IO time, it
performed much better. Now I've moved the data back to the "old" drives
(pvmove again) and am preparing for the "proper", not "quick", setup -
NVMe sector size, XFS sunit/swidth changes and so on.
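The "proper" setup steps mentioned might look roughly like this - a sketch under my own assumptions (the LBA format index, the LV name, and the XFS su/sw values are illustrative; mkfs.xfs normally detects md geometry by itself):

```shell
# Check which LBA formats the drive offers, then switch to 4K sectors;
# the lbaf index varies per drive, so verify with id-ns first.
nvme id-ns /dev/nvme0n1 | grep lbaf
nvme format /dev/nvme0n1 --lbaf=1

# XFS aligned to the 16K chunk; sw=2 assumes 2 effective data disks
# for a 4-drive, 2-copy raid10.
mkfs.xfs -d su=16k,sw=2 /dev/vg0/mysqldata
```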
>
> --
> With respect,
> Roman
--
Best regards,
[COOLCOLD-RIPN]
* Re: raid10, far layout initial sync slow + XFS question
2023-09-01 21:17 ` CoolCold
@ 2023-09-01 21:26 ` Roman Mamedov
0 siblings, 0 replies; 11+ messages in thread
From: Roman Mamedov @ 2023-09-01 21:26 UTC (permalink / raw)
To: CoolCold; +Cc: Linux RAID
On Sat, 2 Sep 2023 04:17:46 +0700
CoolCold <coolthecold@gmail.com> wrote:
> Definitely there is a bottleneck and I very much doubt I'm the first
> one facing this - NVMe drives with > 1GB/sec are quite widespread.
>
> > RAID1 performance at some point if you raise it further to 4096, 8192 or more?
>
> I can try, for sake of testing, but in terms of practical outcome -
> let's imagine with 8MB chunks it reaches maximum - what to do with
> that knowledge?
Maybe then it would be easier for an mdadm developer to chime in and pinpoint
the reason why it might be slow for you at the smaller chunk sizes, provide a
hint if there were any commits improving that aspect in kernel versions later
than you use, etc.
--
With respect,
Roman
* Re: raid10, far layout initial sync slow + XFS question
2023-09-01 20:23 raid10, far layout initial sync slow + XFS question CoolCold
2023-09-01 20:37 ` Roman Mamedov
2023-09-01 20:58 ` CoolCold
@ 2023-09-02 3:56 ` Yu Kuai
2023-09-02 6:07 ` CoolCold
2 siblings, 1 reply; 11+ messages in thread
From: Yu Kuai @ 2023-09-02 3:56 UTC (permalink / raw)
To: CoolCold, Linux RAID; +Cc: yukuai (C)
Hi,
在 2023/09/02 4:23, CoolCold 写道:
> Good day!
>
> I have 4 NVMe new drives which are planned to replace 2 current NVMe
> drives, serving primarily as MYSQL storage, Hetzner dedicated server
> AX161 if it matters. Drives are SAMSUNG MZQL23T8HCLS-00A07, 3.8TB .
> System - Ubuntu 20.04 / 5.4.0-153-generic #170-Ubuntu
>
> So the strange thing I do observe, is its initial raid sync speed.
> Created with:
> mdadm --create /dev/md3 --run -b none --level=10 --layout=f2
> --chunk=16 --raid-devices=4 /dev/nvme0n1 /dev/nvme4n1 /dev/nvme3n1
> /dev/nvme5n1
>
> sync speed:
>
> md3 : active raid10 nvme5n1[3] nvme3n1[2] nvme4n1[1] nvme0n1[0]
> 7501212288 blocks super 1.2 16K chunks 2 far-copies [4/4] [UUUU]
> [=>...................] resync = 6.2% (466905632/7501212288)
> finish=207.7min speed=564418K/sec
>
Is there any read/write load on the array? For raid10, normal IO can't run
concurrently with sync IO, so bandwidth will be bad if both are present,
especially on old kernels.
Thanks,
Kuai
> If I try to create RAID1 with just two drives - sync speed is around
> 3.2GByte per second, sysclt is tuned of course:
> dev.raid.speed_limit_max = 8000000
>
> Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5]
> [raid4] [raid10]
> md70 : active raid1 nvme4n1[1] nvme5n1[0]
> 3750606144 blocks super 1.2 [2/2] [UU]
> [>....................] resync = 1.5% (58270272/3750606144)
> finish=19.0min speed=3237244K/sec
>
> From iostat, drives are basically doing just READs, no writes.
> Quick tests with fio, mounting single drive shows it can do around 30k
> IOPS with 16kb ( fio --rw=write --ioengine=sync --fdatasync=1
> --directory=test-data --size=8200m --bs=16k --name=mytest ) so likely
> issue are not drives themselves.
>
> Not sure where to look further, please advise.
>
* Re: raid10, far layout initial sync slow + XFS question
2023-09-02 3:56 ` Yu Kuai
@ 2023-09-02 6:07 ` CoolCold
2023-09-02 6:13 ` Yu Kuai
0 siblings, 1 reply; 11+ messages in thread
From: CoolCold @ 2023-09-02 6:07 UTC (permalink / raw)
To: Yu Kuai; +Cc: Linux RAID, yukuai (C)
Good day!
No, no other activity happened during the initial sync - at least I have
not done anything. iostat showed only read operations as well.
On Sat, 2 Sept 2023, 10:57 Yu Kuai, <yukuai1@huaweicloud.com> wrote:
>
> Hi,
>
> 在 2023/09/02 4:23, CoolCold 写道:
> > Good day!
> >
> > I have 4 NVMe new drives which are planned to replace 2 current NVMe
> > drives, serving primarily as MYSQL storage, Hetzner dedicated server
> > AX161 if it matters. Drives are SAMSUNG MZQL23T8HCLS-00A07, 3.8TB .
> > System - Ubuntu 20.04 / 5.4.0-153-generic #170-Ubuntu
> >
> > So the strange thing I do observe, is its initial raid sync speed.
> > Created with:
> > mdadm --create /dev/md3 --run -b none --level=10 --layout=f2
> > --chunk=16 --raid-devices=4 /dev/nvme0n1 /dev/nvme4n1 /dev/nvme3n1
> > /dev/nvme5n1
> >
> > sync speed:
> >
> > md3 : active raid10 nvme5n1[3] nvme3n1[2] nvme4n1[1] nvme0n1[0]
> > 7501212288 blocks super 1.2 16K chunks 2 far-copies [4/4] [UUUU]
> > [=>...................] resync = 6.2% (466905632/7501212288)
> > finish=207.7min speed=564418K/sec
> >
> Is there any read/write to the array? Because for raid10, normal io
> can't concurrent with sync io, brandwidth will be bad if they both exit,
> specially for old kernels.
>
> Thanks,
> Kuai
>
> > If I try to create RAID1 with just two drives - sync speed is around
> > 3.2GByte per second, sysclt is tuned of course:
> > dev.raid.speed_limit_max = 8000000
> >
> > Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5]
> > [raid4] [raid10]
> > md70 : active raid1 nvme4n1[1] nvme5n1[0]
> > 3750606144 blocks super 1.2 [2/2] [UU]
> > [>....................] resync = 1.5% (58270272/3750606144)
> > finish=19.0min speed=3237244K/sec
> >
> > From iostat, drives are basically doing just READs, no writes.
> > Quick tests with fio, mounting single drive shows it can do around 30k
> > IOPS with 16kb ( fio --rw=write --ioengine=sync --fdatasync=1
> > --directory=test-data --size=8200m --bs=16k --name=mytest ) so likely
> > issue are not drives themselves.
> >
> > Not sure where to look further, please advise.
> >
>
* Re: raid10, far layout initial sync slow + XFS question
2023-09-02 6:07 ` CoolCold
@ 2023-09-02 6:13 ` Yu Kuai
2023-09-02 6:39 ` CoolCold
0 siblings, 1 reply; 11+ messages in thread
From: Yu Kuai @ 2023-09-02 6:13 UTC (permalink / raw)
To: CoolCold, Yu Kuai; +Cc: Linux RAID, yukuai (C), yangerkun@huawei.com
Hi,
在 2023/09/02 14:07, CoolCold 写道:
> Good day!
> No, no other activities happened during initial sync - at least I have
> not done anything. In iostat it were only read operations as well.
>
I think this is because resync is different for raid1 and raid10.
In raid1, it just reads from one rdev and writes to the other rdev.
In raid10, resync must read from all rdevs first, then compare and write
only if the contents differ, so obviously raid10 will be slower when the
contents differ. However, I do find this behavior strange, because the
contents are likely to differ during an initial sync.
Thanks,
Kuai
>
> On Sat, 2 Sept 2023, 10:57 Yu Kuai, <yukuai1@huaweicloud.com> wrote:
>>
>> Hi,
>>
>> 在 2023/09/02 4:23, CoolCold 写道:
>>> Good day!
>>>
>>> I have 4 NVMe new drives which are planned to replace 2 current NVMe
>>> drives, serving primarily as MYSQL storage, Hetzner dedicated server
>>> AX161 if it matters. Drives are SAMSUNG MZQL23T8HCLS-00A07, 3.8TB .
>>> System - Ubuntu 20.04 / 5.4.0-153-generic #170-Ubuntu
>>>
>>> So the strange thing I do observe, is its initial raid sync speed.
>>> Created with:
>>> mdadm --create /dev/md3 --run -b none --level=10 --layout=f2
>>> --chunk=16 --raid-devices=4 /dev/nvme0n1 /dev/nvme4n1 /dev/nvme3n1
>>> /dev/nvme5n1
>>>
>>> sync speed:
>>>
>>> md3 : active raid10 nvme5n1[3] nvme3n1[2] nvme4n1[1] nvme0n1[0]
>>> 7501212288 blocks super 1.2 16K chunks 2 far-copies [4/4] [UUUU]
>>> [=>...................] resync = 6.2% (466905632/7501212288)
>>> finish=207.7min speed=564418K/sec
>>>
>> Is there any read/write to the array? Because for raid10, normal io
>> can't concurrent with sync io, brandwidth will be bad if they both exit,
>> specially for old kernels.
>>
>> Thanks,
>> Kuai
>>
>>> If I try to create RAID1 with just two drives - sync speed is around
>>> 3.2GByte per second, sysclt is tuned of course:
>>> dev.raid.speed_limit_max = 8000000
>>>
>>> Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5]
>>> [raid4] [raid10]
>>> md70 : active raid1 nvme4n1[1] nvme5n1[0]
>>> 3750606144 blocks super 1.2 [2/2] [UU]
>>> [>....................] resync = 1.5% (58270272/3750606144)
>>> finish=19.0min speed=3237244K/sec
>>>
>>> From iostat, drives are basically doing just READs, no writes.
>>> Quick tests with fio, mounting single drive shows it can do around 30k
>>> IOPS with 16kb ( fio --rw=write --ioengine=sync --fdatasync=1
>>> --directory=test-data --size=8200m --bs=16k --name=mytest ) so likely
>>> issue are not drives themselves.
>>>
>>> Not sure where to look further, please advise.
>>>
>>
>
> .
>
* Re: raid10, far layout initial sync slow + XFS question
2023-09-02 6:13 ` Yu Kuai
@ 2023-09-02 6:39 ` CoolCold
0 siblings, 0 replies; 11+ messages in thread
From: CoolCold @ 2023-09-02 6:39 UTC (permalink / raw)
To: Yu Kuai; +Cc: Linux RAID, yukuai (C), yangerkun@huawei.com
It makes some sense, but from my perspective not much - in any practical
case I can remember, creating a FRESH/NEW array silently assumes you want
it filled with zeroes. When you care about the actual drive content, there
is the --assume-clean option.
Would an option like "--fill-with-zeroes" make sense, doing "no compare"
and just writing the same data to all drives?
My current drives are relatively small - 3.8TB - but 15TB and 30TB NVMes
exist, and I can hardly imagine provisioning a server where the raid10
sync takes a WEEK.
Highly likely I'm not the first one facing this behavior, and the larger
market players have already run into something of this sort.
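One workaround sometimes used in place of a hypothetical --fill-with-zeroes (my sketch, not something proposed in the thread): if the NVMe drives reliably return zeroes after a full deallocate, the members are identical by construction and --assume-clean becomes safe:

```shell
# Deallocate every member so all drives read back zeroes. Verify your
# drives actually guarantee read-zero-after-discard before relying on this.
for d in /dev/nvme0n1 /dev/nvme4n1 /dev/nvme3n1 /dev/nvme5n1; do
    blkdiscard "$d"
done

# All members now read identically, so skipping the resync is safe:
mdadm --create /dev/md3 --run --assume-clean -b none --level=10 \
      --layout=f2 --chunk=16 --raid-devices=4 \
      /dev/nvme0n1 /dev/nvme4n1 /dev/nvme3n1 /dev/nvme5n1
```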
On Sat, Sep 2, 2023 at 1:13 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> Hi,
>
> 在 2023/09/02 14:07, CoolCold 写道:
> > Good day!
> > No, no other activities happened during initial sync - at least I have
> > not done anything. In iostat it were only read operations as well.
> >
> I think this is because resync is different for raid1 and raid10,
>
> In raid1, just read from one rdev and write to the other rdev.
>
> In raid10, resync must read from all rdev first, and then compare and
> write if contents is different, it is obvious that raid10 will be slower
> if contents is different. However, I do feel this behavior is strange,
> because contents is likely different in initial sync.
>
> Thanks,
> Kuai
>
> >
> > On Sat, 2 Sept 2023, 10:57 Yu Kuai, <yukuai1@huaweicloud.com> wrote:
> >>
> >> Hi,
> >>
> >> 在 2023/09/02 4:23, CoolCold 写道:
> >>> Good day!
> >>>
> >>> I have 4 NVMe new drives which are planned to replace 2 current NVMe
> >>> drives, serving primarily as MYSQL storage, Hetzner dedicated server
> >>> AX161 if it matters. Drives are SAMSUNG MZQL23T8HCLS-00A07, 3.8TB .
> >>> System - Ubuntu 20.04 / 5.4.0-153-generic #170-Ubuntu
> >>>
> >>> So the strange thing I do observe, is its initial raid sync speed.
> >>> Created with:
> >>> mdadm --create /dev/md3 --run -b none --level=10 --layout=f2
> >>> --chunk=16 --raid-devices=4 /dev/nvme0n1 /dev/nvme4n1 /dev/nvme3n1
> >>> /dev/nvme5n1
> >>>
> >>> sync speed:
> >>>
> >>> md3 : active raid10 nvme5n1[3] nvme3n1[2] nvme4n1[1] nvme0n1[0]
> >>> 7501212288 blocks super 1.2 16K chunks 2 far-copies [4/4] [UUUU]
> >>> [=>...................] resync = 6.2% (466905632/7501212288)
> >>> finish=207.7min speed=564418K/sec
> >>>
> >> Is there any read/write to the array? Because for raid10, normal io
> >> can't concurrent with sync io, brandwidth will be bad if they both exit,
> >> specially for old kernels.
> >>
> >> Thanks,
> >> Kuai
> >>
> >>> If I try to create RAID1 with just two drives - sync speed is around
> >>> 3.2GByte per second, sysclt is tuned of course:
> >>> dev.raid.speed_limit_max = 8000000
> >>>
> >>> Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5]
> >>> [raid4] [raid10]
> >>> md70 : active raid1 nvme4n1[1] nvme5n1[0]
> >>> 3750606144 blocks super 1.2 [2/2] [UU]
> >>> [>....................] resync = 1.5% (58270272/3750606144)
> >>> finish=19.0min speed=3237244K/sec
> >>>
> >>> From iostat, drives are basically doing just READs, no writes.
> >>> Quick tests with fio, mounting single drive shows it can do around 30k
> >>> IOPS with 16kb ( fio --rw=write --ioengine=sync --fdatasync=1
> >>> --directory=test-data --size=8200m --bs=16k --name=mytest ) so likely
> >>> issue are not drives themselves.
> >>>
> >>> Not sure where to look further, please advise.
> >>>
> >>
> >
> > .
> >
>
--
Best regards,
[COOLCOLD-RIPN]
end of thread, other threads:[~2023-09-02 6:40 UTC | newest]
Thread overview: 11+ messages
2023-09-01 20:23 raid10, far layout initial sync slow + XFS question CoolCold
2023-09-01 20:37 ` Roman Mamedov
2023-09-01 20:43 ` CoolCold
2023-09-01 21:00 ` Roman Mamedov
2023-09-01 21:17 ` CoolCold
2023-09-01 21:26 ` Roman Mamedov
2023-09-01 20:58 ` CoolCold
2023-09-02 3:56 ` Yu Kuai
2023-09-02 6:07 ` CoolCold
2023-09-02 6:13 ` Yu Kuai
2023-09-02 6:39 ` CoolCold