* problems with lots of arrays
@ 2016-05-05 23:24 Mike Lovell
2016-05-06 6:43 ` NeilBrown
2016-05-10 20:48 ` Jes Sorensen
0 siblings, 2 replies; 10+ messages in thread
From: Mike Lovell @ 2016-05-05 23:24 UTC (permalink / raw)
To: linux-raid
we have a number of systems that have a large number of software
arrays running. its in the couple hundred range. we have been using a
custom built kernel based on 3.4 but are wanting to update to a
mainline kernel and have been experimenting with 4.4. the systems are
running recent centos 6 releases but we have been downgrading the
mdadm version from 3.3.2 in 6.7 to a custom build 3.2.6. we installed
the downgraded version due to a problem with array numbering. i
emailed the list a while ago explaining the issue and submitting a
patch to fix [1]. i never heard anything back and since we had a
simple fix i didn't follow up on it.
unfortunately, the 3.2.6 mdadm wasn't working when testing with linux
kernel 4.4. mdadm and the kernel would complain about the devices not
having a valid v1.2 superblock and wouldn't start the array. testing
with 3.3.2 from the current centos repos worked. i'd like to update
but we still have the issue with lots of arrays mentioned previously.
i spent some time checking to make sure that my patch rebases against
master properly (and it does) but during testing i was unable to
create an array with number larger than /dev/md511 when using the 4.4
kernel we were testing as well as the 4.2 kernel i had on another test
box. creating one larger than 511 on a system with a 3.16 kernel
worked. it looks like something broke between kernel 3.16 and 4.2 that
limited the number of arrays to 512 (/dev/md0 to /dev/md511). this was
a problem regardless of mdadm version and i haven't yet done much
digging into the problem.
there are a couple of things that could potentially be done. the
easiest would be to modify find_free_devnm() in mdopen.c so that,
instead of wrapping around to (1<<20)-1, it wraps around to (1<<9)-1.
this would limit mdadm to 512 auto-generated array numbers. i'm
guessing that would be sufficient for the vast majority of cases and
would solve the problem i'm facing at work. the next option would be
to apply the patch from my previous email and then figure out why the
newer versions of the kernel don't support more than 512 arrays. that
would take more work but is probably the better long-term approach.
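to make the first option concrete, the shape of the change i have in
mind is roughly this (illustrative only, not a tested patch; the
variable name is approximate and the real loop in mdopen.c has more
around it):

    /* wrap the search for an auto-generated minor at 512 devices */
    if (devnum == 0)
            devnum = (1 << 9) - 1;    /* previously (1 << 20) - 1 */
    else
            devnum -= 1;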
what do you all think?
thanks
mike
[1] http://marc.info/?l=linux-raid&m=142387809409798&w=2
* Re: problems with lots of arrays
2016-05-05 23:24 problems with lots of arrays Mike Lovell
@ 2016-05-06 6:43 ` NeilBrown
2016-05-06 17:02 ` Mike Lovell
2016-05-10 20:48 ` Jes Sorensen
1 sibling, 1 reply; 10+ messages in thread
From: NeilBrown @ 2016-05-06 6:43 UTC (permalink / raw)
To: Mike Lovell, linux-raid
On Fri, May 06 2016, Mike Lovell wrote:
> we have a number of systems that have a large number of software
> arrays running. its in the couple hundred range. we have been using a
> custom built kernel based on 3.4 but are wanting to update to a
> mainline kernel and have been experimenting with 4.4. the systems are
> running recent centos 6 releases but we have been downgrading the
> mdadm version from 3.3.2 in 6.7 to a custom build 3.2.6. we installed
> the downgraded version due to a problem with array numbering. i
> emailed the list a while ago explaining the issue and submitting a
> patch to fix [1]. i never heard anything back and since we had a
> simple fix i didn't follow up on it.
>
> unfortunately, when testing the 3.2.6 mdadm with linux kernel 4.4
> wasn't working. mdadm and the kernel would complain about the devices
> not having a valid v1.2 superblock and not start the array. testing
> with 3.3.2 from the current centos repos worked. i'd like to update
> but we still have the issue with lots of arrays mentioned previously.
>
> i spent some time checking to make sure that my patch rebases against
> master properly (and it does) but during testing i was unable to
> create an array with number larger than /dev/md511 when using the 4.4
> kernel we were testing as well as the 4.2 kernel i had on another test
> box. creating one larger than 511 on a system with a 3.16 kernel
> worked. it looks like something broke between kernel 3.16 and 4.2 that
> limited the number of arrays to 512 (/dev/md0 to /dev/md511). this was
> a problem regardless of mdadm version and i haven't yet done much
> digging into the problem.
>
> there are a couple things that could potentially be done. the easiest,
> would be to just modify find_free_devnm() in mdopen.c from wrapping to
> (1<<20)-1 and instead have it wrap around to (1<<9))-1. this would
> limit mdadm to 512 auto-generated array numbers. i'm guessing this
> would be sufficient for the vast majority of cases and would solve the
> problem i'm facing at work. the next option would be to apply the
> patch in my previous email and then figuring out why the newer
> versions of the kernel don't support more than 512 arrays. this would
> take more work but probably the better long term approach.
>
I know why newer kernels don't seem to support more than 512 arrays.
Commit: af5628f05db6 ("md: disable probing for md devices 512 and over.")
You can easily use many more md devices by using a newish mdadm and
setting
CREATE names=yes
in /etc/mdadm.conf
You cannot use names like "md512" because that gets confusing, but any
name that isn't a string of digits is fine. e.g. create /dev/md/foo
and the array will be named "md_foo" in the kernel rather than "md127".
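For example (the array name, RAID level and member devices below are
just placeholders):

    # /etc/mdadm.conf
    CREATE names=yes

    # create by name; the kernel device is then "md_scratch"
    # instead of an "mdNNN" number
    mdadm --create /dev/md/scratch --level=1 --raid-devices=2 \
          /dev/sda1 /dev/sdb1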
I guess this qualifies as a regression and regressions are bad.....
But I really wanted to be able to have arrays that didn't get magically
created simply because you open a file in /dev. That just leads to
races with udev.
The magic number "512" appears three times in the kernel.
/* find an unused unit number */
static int next_minor = 512;
and
blk_register_region(MKDEV(MD_MAJOR, 0), 512, THIS_MODULE,
md_probe, NULL, NULL);
and
blk_unregister_region(MKDEV(MD_MAJOR,0), 512);
A boot parameter which set that to something larger would probably be OK
and would solve your immediate problem.
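Something like this is the sort of knob I mean - a sketch only, not a
tested patch, and "md_probe_limit" is an invented name:

    /* drivers/md/md.c sketch: make the probing limit tunable rather
     * than hard-coding 512 in the three places above */
    static int md_probe_limit = 512;
    module_param(md_probe_limit, int, 0444);
    MODULE_PARM_DESC(md_probe_limit,
                     "number of md minors registered for auto-probing");

    /* next_minor would then start at md_probe_limit, and the
     * blk_register_region()/blk_unregister_region() calls would pass
     * md_probe_limit instead of 512 */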
But if you could transition to using named arrays instead of numbered
arrays - even if the names are "/dev/md/X%d" - that would be good, I think.
NeilBrown
* Re: problems with lots of arrays
2016-05-06 6:43 ` NeilBrown
@ 2016-05-06 17:02 ` Mike Lovell
2016-05-06 17:59 ` Mike Lovell
0 siblings, 1 reply; 10+ messages in thread
From: Mike Lovell @ 2016-05-06 17:02 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid
On Fri, May 6, 2016 at 12:43 AM, NeilBrown <nfbrown@novell.com> wrote:
> I know why newer kernels don't seem to support more than 512 array.
>
> Commit: af5628f05db6 ("md: disable probing for md devices 512 and over.")
>
>
> You can easily use many more md devices by using a newish mdadm and
> setting
>
> CREATE names=yes
>
> in /etc/mdadm.conf
>
> You cannot use names like "md512" because that gets confusing, but any
> name that isn't a string of digits is fine. e.g. create /dev/md/foo
> and the array will be named "md_foo" in the kernel rather than "md127".
>
> I guess this qualifies as a regression and regressions are bad.....
> But I really wanted to be able to have arrays that didn't get magically
> created simply because you open a file in /dev. That just leads to
> races with udev.
>
> The magic number "512" appears three times in the kernel.
>
> /* find an unused unit number */
> static int next_minor = 512;
>
> and
>
> blk_register_region(MKDEV(MD_MAJOR, 0), 512, THIS_MODULE,
> md_probe, NULL, NULL);
> and
> blk_unregister_region(MKDEV(MD_MAJOR,0), 512);
>
> A boot parameter which set that to something larger would probably be OK
> and would solve your immediate problem.
>
> But if you could transition to using named arrays instead of numbered
> arrays - even if that are "/dev/md/X%d", that would be be good I think.
>
> NeilBrown
we actually do specify the name to mdadm --create and mdadm --assemble
and have a naming scheme from our own internal tools. the problem we
were running into was that mdadm would auto-generate a minor number
that was invalid but we also don't have "CREATE names=yes" in
mdadm.conf. i'll have to experiment with that one.
thanks
mike
* Re: problems with lots of arrays
2016-05-06 17:02 ` Mike Lovell
@ 2016-05-06 17:59 ` Mike Lovell
2016-05-06 23:13 ` NeilBrown
0 siblings, 1 reply; 10+ messages in thread
From: Mike Lovell @ 2016-05-06 17:59 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid
On Fri, May 6, 2016 at 11:02 AM, Mike Lovell <mike.lovell@endurance.com> wrote:
> On Fri, May 6, 2016 at 12:43 AM, NeilBrown <nfbrown@novell.com> wrote:
>> I know why newer kernels don't seem to support more than 512 array.
>>
>> Commit: af5628f05db6 ("md: disable probing for md devices 512 and over.")
>>
>>
>> You can easily use many more md devices by using a newish mdadm and
>> setting
>>
>> CREATE names=yes
>>
>> in /etc/mdadm.conf
>>
>> You cannot use names like "md512" because that gets confusing, but any
>> name that isn't a string of digits is fine. e.g. create /dev/md/foo
>> and the array will be named "md_foo" in the kernel rather than "md127".
>>
>> I guess this qualifies as a regression and regressions are bad.....
>> But I really wanted to be able to have arrays that didn't get magically
>> created simply because you open a file in /dev. That just leads to
>> races with udev.
>>
>> The magic number "512" appears three times in the kernel.
>>
>> /* find an unused unit number */
>> static int next_minor = 512;
>>
>> and
>>
>> blk_register_region(MKDEV(MD_MAJOR, 0), 512, THIS_MODULE,
>> md_probe, NULL, NULL);
>> and
>> blk_unregister_region(MKDEV(MD_MAJOR,0), 512);
>>
>> A boot parameter which set that to something larger would probably be OK
>> and would solve your immediate problem.
>>
>> But if you could transition to using named arrays instead of numbered
>> arrays - even if that are "/dev/md/X%d", that would be be good I think.
>>
>> NeilBrown
>
> we actually do specify the name to mdadm --create and mdadm --assemble
> and have a naming scheme from our own internal tools. the problem we
> were running into was that mdadm would auto-generate a minor number
> that was invalid but we also don't have "CREATE names=yes" in
> mdadm.conf. i'll have to experiment with that one.
i just tested with "CREATE names=yes" in /etc/mdadm.conf and using
some test names seems to work properly. the array was created using
the name and the kernel chose minor numbers starting at 512. i then
tried some of our management tools and things failed. it looks like
its having a problem with our naming scheme. its using names that are
a little over 30 characters with - and _ in them. are there supposed
to be any restrictions on the array name?
specifically, this is what happened from mdadm when trying. (names
changed to protect the innocent :) )
$ sudo mdadm -A /dev/md/test-volume_a-123456_123456 /dev/dm-1 /dev/dm-2
*** buffer overflow detected ***: mdadm terminated
======= Backtrace: =========
/lib64/libc.so.6(__fortify_fail+0x37)[0x7f7e50f40567]
/lib64/libc.so.6(+0x100450)[0x7f7e50f3e450]
/lib64/libc.so.6(+0xff8a9)[0x7f7e50f3d8a9]
/lib64/libc.so.6(_IO_default_xsputn+0xc9)[0x7f7e50eb2639]
/lib64/libc.so.6(_IO_vfprintf+0x41c0)[0x7f7e50e86190]
/lib64/libc.so.6(__vsprintf_chk+0x9d)[0x7f7e50f3d94d]
/lib64/libc.so.6(__sprintf_chk+0x7f)[0x7f7e50f3d88f]
mdadm[0x43068e]
mdadm[0x417089]
mdadm[0x4058a4]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7f7e50e5cd5d]
mdadm[0x402ca9]
======= Memory map: ========
00400000-0046d000 r-xp 00000000 09:00 16908296
/sbin/mdadm
0066d000-00674000 rw-p 0006d000 09:00 16908296
/sbin/mdadm
00674000-00687000 rw-p 00000000 00:00 0
00fbb000-00fdc000 rw-p 00000000 00:00 0 [heap]
7f7e50c28000-7f7e50c3e000 r-xp 00000000 09:00 18874642
/lib64/libgcc_s-4.4.7-20120601.so.1
7f7e50c3e000-7f7e50e3d000 ---p 00016000 09:00 18874642
/lib64/libgcc_s-4.4.7-20120601.so.1
7f7e50e3d000-7f7e50e3e000 rw-p 00015000 09:00 18874642
/lib64/libgcc_s-4.4.7-20120601.so.1
7f7e50e3e000-7f7e50fc8000 r-xp 00000000 09:00 18874440
/lib64/libc-2.12.so
7f7e50fc8000-7f7e511c8000 ---p 0018a000 09:00 18874440
/lib64/libc-2.12.so
7f7e511c8000-7f7e511cc000 r--p 0018a000 09:00 18874440
/lib64/libc-2.12.so
7f7e511cc000-7f7e511cd000 rw-p 0018e000 09:00 18874440
/lib64/libc-2.12.so
7f7e511cd000-7f7e511d2000 rw-p 00000000 00:00 0
7f7e511d2000-7f7e511f2000 r-xp 00000000 09:00 18874758
/lib64/ld-2.12.so
7f7e513e5000-7f7e513e8000 rw-p 00000000 00:00 0
7f7e513ee000-7f7e513f1000 rw-p 00000000 00:00 0
7f7e513f1000-7f7e513f2000 r--p 0001f000 09:00 18874758
/lib64/ld-2.12.so
7f7e513f2000-7f7e513f3000 rw-p 00020000 09:00 18874758
/lib64/ld-2.12.so
7f7e513f3000-7f7e513f4000 rw-p 00000000 00:00 0
7ffe90a1f000-7ffe90a40000 rw-p 00000000 00:00 0 [stack]
7ffe90ab1000-7ffe90ab3000 r--p 00000000 00:00 0 [vvar]
7ffe90ab3000-7ffe90ab5000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0
[vsyscall]
this was with kernel 4.4.8 and mdadm 3.3.2-5.el6
thanks
mike
* Re: problems with lots of arrays
2016-05-06 17:59 ` Mike Lovell
@ 2016-05-06 23:13 ` NeilBrown
0 siblings, 0 replies; 10+ messages in thread
From: NeilBrown @ 2016-05-06 23:13 UTC (permalink / raw)
To: Mike Lovell; +Cc: linux-raid
On Sat, May 07 2016, Mike Lovell wrote:
> On Fri, May 6, 2016 at 11:02 AM, Mike Lovell <mike.lovell@endurance.com> wrote:
>> On Fri, May 6, 2016 at 12:43 AM, NeilBrown <nfbrown@novell.com> wrote:
>>> I know why newer kernels don't seem to support more than 512 array.
>>>
>>> Commit: af5628f05db6 ("md: disable probing for md devices 512 and over.")
>>>
>>>
>>> You can easily use many more md devices by using a newish mdadm and
>>> setting
>>>
>>> CREATE names=yes
>>>
>>> in /etc/mdadm.conf
>>>
>>> You cannot use names like "md512" because that gets confusing, but any
>>> name that isn't a string of digits is fine. e.g. create /dev/md/foo
>>> and the array will be named "md_foo" in the kernel rather than "md127".
>>>
>>> I guess this qualifies as a regression and regressions are bad.....
>>> But I really wanted to be able to have arrays that didn't get magically
>>> created simply because you open a file in /dev. That just leads to
>>> races with udev.
>>>
>>> The magic number "512" appears three times in the kernel.
>>>
>>> /* find an unused unit number */
>>> static int next_minor = 512;
>>>
>>> and
>>>
>>> blk_register_region(MKDEV(MD_MAJOR, 0), 512, THIS_MODULE,
>>> md_probe, NULL, NULL);
>>> and
>>> blk_unregister_region(MKDEV(MD_MAJOR,0), 512);
>>>
>>> A boot parameter which set that to something larger would probably be OK
>>> and would solve your immediate problem.
>>>
>>> But if you could transition to using named arrays instead of numbered
>>> arrays - even if that are "/dev/md/X%d", that would be be good I think.
>>>
>>> NeilBrown
>>
>> we actually do specify the name to mdadm --create and mdadm --assemble
>> and have a naming scheme from our own internal tools. the problem we
>> were running into was that mdadm would auto-generate a minor number
>> that was invalid but we also don't have "CREATE names=yes" in
>> mdadm.conf. i'll have to experiment with that one.
>
> i just tested with "CREATE names=yes" in /etc/mdadm.conf and using
> some test names seems to work properly. the array was created using
> the name and the kernel chose minor numbers starting at 512. i then
> tried some of our management tools and things failed. it looks like
> its having a problem with our naming scheme. its using names that are
> a little over 30 characters with - and _ in them. are there supposed
> to be any restrictions on the array name?
The kernel imposes a limit on the size of disk names:
#define DISK_NAME_LEN 32
1 byte is needed for the trailing '\0' and 3 for the leading "md_", so 28
characters are available for md device names.
That doesn't excuse mdadm for behaving so badly. There are quite a lot
of "sprintf"s in mdadm that should probably be "snprintf", and which
should possibly have extra error checking.
I suspect you are hitting something that sprintfs into
char devnm[32];
in create_mddev. That should definitely be larger.
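The sort of check I mean would look something like this - a sketch
only, not the actual create_mddev() code, and the helper name is
invented:

    #include <stdio.h>

    #define KERNEL_DISK_NAME_LEN 32    /* DISK_NAME_LEN in the kernel */

    /* build the in-kernel device name with bounds checking; on failure
     * the caller would fall back to a numbered name */
    static int make_devnm(char *devnm, size_t len, const char *name)
    {
            int n = snprintf(devnm, len, "md_%s", name);

            if (n < 0 || (size_t)n >= len || n >= KERNEL_DISK_NAME_LEN) {
                    fprintf(stderr,
                            "mdadm: array name '%s' is too long for the kernel\n",
                            name);
                    return -1;
            }
            return 0;
    }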
NeilBrown
>
> specifically, this is what happened from mdadm when trying. (names
> changes to protect the innocent :) )
>
> $ sudo mdadm -A /dev/md/test-volume_a-123456_123456 /dev/dm-1 /dev/dm-2
> *** buffer overflow detected ***: mdadm terminated
> ======= Backtrace: =========
> /lib64/libc.so.6(__fortify_fail+0x37)[0x7f7e50f40567]
> /lib64/libc.so.6(+0x100450)[0x7f7e50f3e450]
> /lib64/libc.so.6(+0xff8a9)[0x7f7e50f3d8a9]
> /lib64/libc.so.6(_IO_default_xsputn+0xc9)[0x7f7e50eb2639]
> /lib64/libc.so.6(_IO_vfprintf+0x41c0)[0x7f7e50e86190]
> /lib64/libc.so.6(__vsprintf_chk+0x9d)[0x7f7e50f3d94d]
> /lib64/libc.so.6(__sprintf_chk+0x7f)[0x7f7e50f3d88f]
> mdadm[0x43068e]
> mdadm[0x417089]
> mdadm[0x4058a4]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f7e50e5cd5d]
> mdadm[0x402ca9]
> ======= Memory map: ========
> 00400000-0046d000 r-xp 00000000 09:00 16908296
> /sbin/mdadm
> 0066d000-00674000 rw-p 0006d000 09:00 16908296
> /sbin/mdadm
> 00674000-00687000 rw-p 00000000 00:00 0
> 00fbb000-00fdc000 rw-p 00000000 00:00 0 [heap]
> 7f7e50c28000-7f7e50c3e000 r-xp 00000000 09:00 18874642
> /lib64/libgcc_s-4.4.7-20120601.so.1
> 7f7e50c3e000-7f7e50e3d000 ---p 00016000 09:00 18874642
> /lib64/libgcc_s-4.4.7-20120601.so.1
> 7f7e50e3d000-7f7e50e3e000 rw-p 00015000 09:00 18874642
> /lib64/libgcc_s-4.4.7-20120601.so.1
> 7f7e50e3e000-7f7e50fc8000 r-xp 00000000 09:00 18874440
> /lib64/libc-2.12.so
> 7f7e50fc8000-7f7e511c8000 ---p 0018a000 09:00 18874440
> /lib64/libc-2.12.so
> 7f7e511c8000-7f7e511cc000 r--p 0018a000 09:00 18874440
> /lib64/libc-2.12.so
> 7f7e511cc000-7f7e511cd000 rw-p 0018e000 09:00 18874440
> /lib64/libc-2.12.so
> 7f7e511cd000-7f7e511d2000 rw-p 00000000 00:00 0
> 7f7e511d2000-7f7e511f2000 r-xp 00000000 09:00 18874758
> /lib64/ld-2.12.so
> 7f7e513e5000-7f7e513e8000 rw-p 00000000 00:00 0
> 7f7e513ee000-7f7e513f1000 rw-p 00000000 00:00 0
> 7f7e513f1000-7f7e513f2000 r--p 0001f000 09:00 18874758
> /lib64/ld-2.12.so
> 7f7e513f2000-7f7e513f3000 rw-p 00020000 09:00 18874758
> /lib64/ld-2.12.so
> 7f7e513f3000-7f7e513f4000 rw-p 00000000 00:00 0
> 7ffe90a1f000-7ffe90a40000 rw-p 00000000 00:00 0 [stack]
> 7ffe90ab1000-7ffe90ab3000 r--p 00000000 00:00 0 [vvar]
> 7ffe90ab3000-7ffe90ab5000 r-xp 00000000 00:00 0 [vdso]
> ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0
> [vsyscall]
>
> this was with kernel 4.4.8 and mdadm 3.3.2-5.el6
>
> thanks
> mike
* Re: problems with lots of arrays
2016-05-05 23:24 problems with lots of arrays Mike Lovell
2016-05-06 6:43 ` NeilBrown
@ 2016-05-10 20:48 ` Jes Sorensen
2016-05-10 22:39 ` NeilBrown
1 sibling, 1 reply; 10+ messages in thread
From: Jes Sorensen @ 2016-05-10 20:48 UTC (permalink / raw)
To: Mike Lovell; +Cc: linux-raid, NeilBrown
Mike Lovell <mike.lovell@endurance.com> writes:
> we have a number of systems that have a large number of software
> arrays running. its in the couple hundred range. we have been using a
> custom built kernel based on 3.4 but are wanting to update to a
> mainline kernel and have been experimenting with 4.4. the systems are
> running recent centos 6 releases but we have been downgrading the
> mdadm version from 3.3.2 in 6.7 to a custom build 3.2.6. we installed
> the downgraded version due to a problem with array numbering. i
> emailed the list a while ago explaining the issue and submitting a
> patch to fix [1]. i never heard anything back and since we had a
> simple fix i didn't follow up on it.
[snip]
> what do you all think?
>
> thanks
> mike
>
> [1] http://marc.info/?l=linux-raid&m=142387809409798&w=2
Staying consistent in using dev_t rather than casting back and forth to
int seems a reasonable fix to apply to mdadm. It obviously won't change
the issues with the newer kernels, but I don't see any reason why we
shouldn't apply that fix to mdadm.
Neil any thoughts on this?
Cheers,
Jes
* Re: problems with lots of arrays
2016-05-10 20:48 ` Jes Sorensen
@ 2016-05-10 22:39 ` NeilBrown
2016-05-11 0:45 ` Shaohua Li
0 siblings, 1 reply; 10+ messages in thread
From: NeilBrown @ 2016-05-10 22:39 UTC (permalink / raw)
To: Jes Sorensen, Mike Lovell; +Cc: linux-raid
On Wed, May 11 2016, Jes Sorensen wrote:
> Mike Lovell <mike.lovell@endurance.com> writes:
>> we have a number of systems that have a large number of software
>> arrays running. its in the couple hundred range. we have been using a
>> custom built kernel based on 3.4 but are wanting to update to a
>> mainline kernel and have been experimenting with 4.4. the systems are
>> running recent centos 6 releases but we have been downgrading the
>> mdadm version from 3.3.2 in 6.7 to a custom build 3.2.6. we installed
>> the downgraded version due to a problem with array numbering. i
>> emailed the list a while ago explaining the issue and submitting a
>> patch to fix [1]. i never heard anything back and since we had a
>> simple fix i didn't follow up on it.
>
> [snip]
>
>> what do you all think?
>>
>> thanks
>> mike
>>
>> [1] http://marc.info/?l=linux-raid&m=142387809409798&w=2
>
> Staying consistent in using dev_t rather than casting back and forth to
> int seems a reasonable fix to apply to mdadm. It obviously won't change
> the issues with the newer kernels, but I don't see any reason why we
> shouldn't apply that fix to mdadm.
>
> Neil any thoughts on this?
I agree that changing "int" to "dev_t" is a good idea.
We should really fix the more general problem too.
On any kernel with /sys/module/md_mod/parameters/new_array,
find_free_devnm should avoid trying anything above 511, i.e. (1<<9)-1.
If that fails to find a free number, then it should probably try a name
like "md_NN" and act as though ci->name is set.
Also, when the "name" given for the md array is longer than 28 bytes,
we need to fall back to choosing an array name ourselves even if
ci->name is set. Start with md_512 and work upwards.
Rather than probing we should read /sys/block looking for "md_*" and
maybe choose 1 more than the largest number found.
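Roughly this sort of scan - a sketch only, not mdadm code, with error
handling kept minimal:

    #include <dirent.h>
    #include <stdlib.h>
    #include <string.h>

    /* pick the next free md_NNN name by scanning /sys/block and
     * returning one more than the largest number found (512 if none) */
    static int next_md_number(void)
    {
            DIR *dir = opendir("/sys/block");
            struct dirent *de;
            long best = 511;

            if (!dir)
                    return 512;
            while ((de = readdir(dir)) != NULL) {
                    char *end;
                    long n;

                    if (strncmp(de->d_name, "md_", 3) != 0)
                            continue;
                    n = strtol(de->d_name + 3, &end, 10);
                    if (end != de->d_name + 3 && *end == '\0' && n > best)
                            best = n;
            }
            closedir(dir);
            return (int)(best + 1);
    }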
Thanks,
NeilBrown
* Re: problems with lots of arrays
2016-05-10 22:39 ` NeilBrown
@ 2016-05-11 0:45 ` Shaohua Li
2016-05-12 1:55 ` NeilBrown
0 siblings, 1 reply; 10+ messages in thread
From: Shaohua Li @ 2016-05-11 0:45 UTC (permalink / raw)
To: NeilBrown; +Cc: Jes Sorensen, Mike Lovell, linux-raid
On Wed, May 11, 2016 at 08:39:53AM +1000, NeilBrown wrote:
> On Wed, May 11 2016, Jes Sorensen wrote:
>
> > Mike Lovell <mike.lovell@endurance.com> writes:
> >> we have a number of systems that have a large number of software
> >> arrays running. its in the couple hundred range. we have been using a
> >> custom built kernel based on 3.4 but are wanting to update to a
> >> mainline kernel and have been experimenting with 4.4. the systems are
> >> running recent centos 6 releases but we have been downgrading the
> >> mdadm version from 3.3.2 in 6.7 to a custom build 3.2.6. we installed
> >> the downgraded version due to a problem with array numbering. i
> >> emailed the list a while ago explaining the issue and submitting a
> >> patch to fix [1]. i never heard anything back and since we had a
> >> simple fix i didn't follow up on it.
> >
> > [snip]
> >
> >> what do you all think?
> >>
> >> thanks
> >> mike
> >>
> >> [1] http://marc.info/?l=linux-raid&m=142387809409798&w=2
> >
> > Staying consistent in using dev_t rather than casting back and forth to
> > int seems a reasonable fix to apply to mdadm. It obviously won't change
> > the issues with the newer kernels, but I don't see any reason why we
> > shouldn't apply that fix to mdadm.
> >
> > Neil any thoughts on this?
>
> I agree that changing "int" to "dev_t" is a good idea.
>
> We should really fix the more general problem too.
>
> On any kernel with /sys/module/md_mod/parameters/new_array
> find_free_devnm avoid trying anything above 511. (1<<9)-1.
>
> If that fails to find a free number, then it should probably try a name
> like "md_NN" and act as though ci->name is set.
>
> Also, when a "name" given for the md array that is longer than 28 bytes
> we need to fall back to choose an array name ourselves even if ci->name
> is set. Start with md_512 and work upwards.
> Rather than probing we should read /sys/block looking for "md_*" and
> maybe choose 1 more than the largest number found.
I'm wondering why udev opens the device by major/minor without checking
whether the device exists. A simple 'stat' check would be neat.
Thanks,
Shaohua
* Re: problems with lots of arrays
2016-05-11 0:45 ` Shaohua Li
@ 2016-05-12 1:55 ` NeilBrown
2016-05-12 5:58 ` Hannes Reinecke
0 siblings, 1 reply; 10+ messages in thread
From: NeilBrown @ 2016-05-12 1:55 UTC (permalink / raw)
To: Shaohua Li; +Cc: Jes Sorensen, Mike Lovell, linux-raid
On Wed, May 11 2016, Shaohua Li wrote:
> On Wed, May 11, 2016 at 08:39:53AM +1000, NeilBrown wrote:
>> On Wed, May 11 2016, Jes Sorensen wrote:
>>
>> > Mike Lovell <mike.lovell@endurance.com> writes:
>> >> we have a number of systems that have a large number of software
>> >> arrays running. its in the couple hundred range. we have been using a
>> >> custom built kernel based on 3.4 but are wanting to update to a
>> >> mainline kernel and have been experimenting with 4.4. the systems are
>> >> running recent centos 6 releases but we have been downgrading the
>> >> mdadm version from 3.3.2 in 6.7 to a custom build 3.2.6. we installed
>> >> the downgraded version due to a problem with array numbering. i
>> >> emailed the list a while ago explaining the issue and submitting a
>> >> patch to fix [1]. i never heard anything back and since we had a
>> >> simple fix i didn't follow up on it.
>> >
>> > [snip]
>> >
>> >> what do you all think?
>> >>
>> >> thanks
>> >> mike
>> >>
>> >> [1] http://marc.info/?l=linux-raid&m=142387809409798&w=2
>> >
>> > Staying consistent in using dev_t rather than casting back and forth to
>> > int seems a reasonable fix to apply to mdadm. It obviously won't change
>> > the issues with the newer kernels, but I don't see any reason why we
>> > shouldn't apply that fix to mdadm.
>> >
>> > Neil any thoughts on this?
>>
>> I agree that changing "int" to "dev_t" is a good idea.
>>
>> We should really fix the more general problem too.
>>
>> On any kernel with /sys/module/md_mod/parameters/new_array
>> find_free_devnm avoid trying anything above 511. (1<<9)-1.
>>
>> If that fails to find a free number, then it should probably try a name
>> like "md_NN" and act as though ci->name is set.
>>
>> Also, when a "name" given for the md array that is longer than 28 bytes
>> we need to fall back to choose an array name ourselves even if ci->name
>> is set. Start with md_512 and work upwards.
>> Rather than probing we should read /sys/block looking for "md_*" and
>> maybe choose 1 more than the largest number found.
>
> I'm wondering why udev open the device with major/minor without checking if the
> device exists. A simple 'stat' check is neat.
A big part of the role of udev is to create the device nodes in /dev.
NeilBrown
* Re: problems with lots of arrays
2016-05-12 1:55 ` NeilBrown
@ 2016-05-12 5:58 ` Hannes Reinecke
0 siblings, 0 replies; 10+ messages in thread
From: Hannes Reinecke @ 2016-05-12 5:58 UTC (permalink / raw)
To: NeilBrown, Shaohua Li; +Cc: Jes Sorensen, Mike Lovell, linux-raid
On 05/12/2016 03:55 AM, NeilBrown wrote:
> On Wed, May 11 2016, Shaohua Li wrote:
>
>> On Wed, May 11, 2016 at 08:39:53AM +1000, NeilBrown wrote:
>>> On Wed, May 11 2016, Jes Sorensen wrote:
>>>
>>>> Mike Lovell <mike.lovell@endurance.com> writes:
>>>>> we have a number of systems that have a large number of
>>>>> software arrays running. its in the couple hundred range.
>>>>> we have been using a custom built kernel based on 3.4 but
>>>>> are wanting to update to a mainline kernel and have been
>>>>> experimenting with 4.4. the systems are running recent
>>>>> centos 6 releases but we have been downgrading the mdadm
>>>>> version from 3.3.2 in 6.7 to a custom build 3.2.6. we
>>>>> installed the downgraded version due to a problem with
>>>>> array numbering. i emailed the list a while ago explaining
>>>>> the issue and submitting a patch to fix [1]. i never heard
>>>>> anything back and since we had a simple fix i didn't follow
>>>>> up on it.
>>>>
>>>> [snip]
>>>>
>>>>> what do you all think?
>>>>>
>>>>> thanks mike
>>>>>
>>>>> [1] http://marc.info/?l=linux-raid&m=142387809409798&w=2
>>>>
>>>> Staying consistent in using dev_t rather than casting back
>>>> and forth to int seems a reasonable fix to apply to mdadm. It
>>>> obviously won't change the issues with the newer kernels, but
>>>> I don't see any reason why we shouldn't apply that fix to
>>>> mdadm.
>>>>
>>>> Neil any thoughts on this?
>>>
>>> I agree that changing "int" to "dev_t" is a good idea.
>>>
>>> We should really fix the more general problem too.
>>>
>>> On any kernel with /sys/module/md_mod/parameters/new_array
>>> find_free_devnm avoid trying anything above 511. (1<<9)-1.
>>>
>>> If that fails to find a free number, then it should probably
>>> try a name like "md_NN" and act as though ci->name is set.
>>>
>>> Also, when a "name" given for the md array that is longer than
>>> 28 bytes we need to fall back to choose an array name ourselves
>>> even if ci->name is set. Start with md_512 and work upwards.
>>> Rather than probing we should read /sys/block looking for
>>> "md_*" and maybe choose 1 more than the largest number found.
>>
>> I'm wondering why udev open the device with major/minor without
>> checking if the device exists. A simple 'stat' check is neat.
>
> A big part of the role of udev is to create the device nodes in
> /dev.
>
Not any more.
devtmpfs will create the device nodes automagically,
no need for udev to interfere.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)