linux-raid.vger.kernel.org archive mirror
* MDADM 3.3 broken?
@ 2013-11-18 18:26 David F.
  2013-11-18 20:22 ` Martin Wilck
  0 siblings, 1 reply; 44+ messages in thread
From: David F. @ 2013-11-18 18:26 UTC (permalink / raw)
  To: linux-raid

Hi,

we updated our Linux boot disk from mdadm 3.2.6 to 3.3, and customers
are finding that their RAID is no longer detected.  It's only been a
couple of weeks, and based on the number of customers affected we know
there is an issue.  As a workaround, we're having those with problems
load dmraid instead for now.

We also ran tests locally and are finding intermittent problems with
RAID-0 on ISW - sometimes 3.3 doesn't identify both drives as RAID
members.  3.2.6 works 100% of the time.

There is also a problem with DDF RAID - for example, a Cisco server
(see C220M3_LFF_SpecSheet.pdf) not detecting RAID5.  I believe they are
using LSI MegaRAID, since dmraid reports that.

Are these problems known?  We wouldn't mind moving to the latest
version if you're pretty sure it fixes them; otherwise we're going to
have to revert to 3.2.6.

TIA!!

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-18 18:26 MDADM 3.3 broken? David F.
@ 2013-11-18 20:22 ` Martin Wilck
  2013-11-18 23:13   ` David F.
  0 siblings, 1 reply; 44+ messages in thread
From: Martin Wilck @ 2013-11-18 20:22 UTC (permalink / raw)
  To: David F.; +Cc: linux-raid

On 11/18/2013 07:26 PM, David F. wrote:
> Hi,
> 
> we updated our linux disk with mdadm 3.3 from 3.2.6 and customers are
> finding their RAID is no longer detected.  It's only been a couple
> weeks and based on the number of customers, we know there is an issue.
>  We're having those with problems workaround by having them load
> dmraid instead for now.
> 
> We also did tests locally and finding intermittent problems with
> RAID-0 on ISW - sometimes 3.3 doesn't identify both drives as RAID
> members.  3.2.6 works 100% of the time.
> 
> Also with DDF RAID - cisco server for example not detecting RAID5 -
> C220M3_LFF_SpecSheet.pdf. I believe they are using the LSI MegaRaid
> since DMRAID reports that.

Could you please provide mdadm -E and possibly mdadm --dump output of
the disks that aren't detected? How does RAID discovery work on your
systems? Are you using standard udev rules or something special? What
does your mdadm.conf look like?

Regards
Martin


> 
> Are these problems known - we wouldn't mind moving to the latest
> version if your pretty sure it fixes it, otherwise we're going to have
> to revert to 3.2.6?
> 
> TIA!!
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-18 20:22 ` Martin Wilck
@ 2013-11-18 23:13   ` David F.
  2013-11-19  0:01     ` NeilBrown
  0 siblings, 1 reply; 44+ messages in thread
From: David F. @ 2013-11-18 23:13 UTC (permalink / raw)
  To: Martin Wilck; +Cc: linux-raid

Sure - here is the ISW issue we can reproduce (output from both versions
is included below):

Comparison of results using mdadm 3.2.6 (OK) vs. mdadm 3.3 (not OK),
under otherwise identical conditions, from our boot disk, when trying to
assemble an IMSM RAID0 array (two entire 500 GB drives, /dev/sdb and
/dev/sdc).

The boot disk uses kernel 3.11.7, and udev 175 from the Debian Wheezy
udev package, including
their rules file (64-md-raid.rules).

Note: in this case at least, we get two different results when running
mdadm 3.3 under the same conditions (some kind of race condition?).
The two sets of results are under the headings "output1" and "output2"
below.  In output2 the array is successfully assembled (using /dev/sdb
and /dev/sdc), while in output1 mdadm picks /dev/sdb and /dev/sdc2 and
the array is not assembled correctly.


contents of mdadm.conf (when attempting to assemble array):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
DEVICE partitions containers

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

ARRAY metadata=imsm UUID=0ee9661b:d0ee3f07:e2f3b890:f658562d
ARRAY /dev/md/RAID0 container=0ee9661b:d0ee3f07:e2f3b890:f658562d
member=0 UUID=540c3a88:7717daff:ccaf97eb:7961ac32


script that starts mdadm on boot (after udev started):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#! /bin/bash

if [ ! -e /proc/mdstat ]; then
    echo "Software RAID drivers not loaded"
    exit 0
fi

if [ ! -e /etc/mdadm/mdadm.conf-default ]; then
    echo "Default config file not found in /etc/mdadm"
    exit 0
else
    cp /etc/mdadm/mdadm.conf-default /etc/mdadm/mdadm.conf
fi

mdadm --examine --scan >> /etc/mdadm/mdadm.conf
mdadm --assemble --scan --no-degraded
echo


output of 'mdadm --examine --scan' (same for both):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ARRAY metadata=imsm UUID=0ee9661b:d0ee3f07:e2f3b890:f658562d
ARRAY /dev/md/RAID0 container=0ee9661b:d0ee3f07:e2f3b890:f658562d
member=0 UUID=540c3a88:7717daff:ccaf97eb:7961ac32


output of 'mdadm --assemble --scan --no-degraded -v' (mdadm 3.2.6):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mdadm: looking for devices for further assembly
mdadm: no RAID superblock on /dev/sdd4
mdadm: no RAID superblock on /dev/sdd3
mdadm: no RAID superblock on /dev/sdd2
mdadm: no RAID superblock on /dev/sdd1
mdadm: no RAID superblock on /dev/sdd
mdadm: cannot open device /dev/sr1: No medium found
mdadm: no RAID superblock on /dev/sdc2
mdadm: no RAID superblock on /dev/sdc1
mdadm: no RAID superblock on /dev/sda11
mdadm: no RAID superblock on /dev/sda10
mdadm: no RAID superblock on /dev/sda9
mdadm: no RAID superblock on /dev/sda8
mdadm: no RAID superblock on /dev/sda7
mdadm: no RAID superblock on /dev/sda6
mdadm: no RAID superblock on /dev/sda5
mdadm: no RAID superblock on /dev/sda4
mdadm: no RAID superblock on /dev/sda3
mdadm: no RAID superblock on /dev/sda2
mdadm: no RAID superblock on /dev/sda1
mdadm: no RAID superblock on /dev/sda
mdadm: cannot open device /dev/sr0: No medium found
mdadm: /dev/sdc is identified as a member of /dev/md/imsm0, slot -1.
mdadm: /dev/sdb is identified as a member of /dev/md/imsm0, slot -1.
mdadm: added /dev/sdb to /dev/md/imsm0 as -1
mdadm: added /dev/sdc to /dev/md/imsm0 as -1
mdadm: Container /dev/md/imsm0 has been assembled with 2 drives
mdadm: looking for devices for /dev/md/RAID0
mdadm: looking in container /dev/md127
mdadm: found match on member /md127/0 in /dev/md127
mdadm: Started /dev/md/RAID0 with 2 devices

output of 'dmesg | grep md:' and 'ls -l /dev/sdc*' - mdadm 3.2.6:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
md: linear personality registered for level -1
md: raid0 personality registered for level 0
md: raid1 personality registered for level 1
md: raid10 personality registered for level 10
md: raid6 personality registered for level 6
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
md: multipath personality registered for level -4
md: md127 stopped.
md: bind<sdb>
md: bind<sdc>
md: md126 stopped.
md: bind<sdb>
md: bind<sdc>
md: RAID0 configuration for md126 - 1 zone
md: zone0=[sdc/sdb]
brw-rw---T    1 root     disk        8,  32 Nov 18 14:59 /dev/sdc
brw-rw---T    1 root     disk        8,  33 Nov 18 09:59 /dev/sdc1
brw-rw---T    1 root     disk        8,  34 Nov 18 09:59 /dev/sdc2

output1 of 'mdadm --assemble --scan --no-degraded -v' (mdadm 3.3 -
note using /dev/sdc2, not /dev/sdc):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mdadm: looking for devices for further assembly
mdadm: no RAID superblock on /dev/sdd4
mdadm: no RAID superblock on /dev/sdd3
mdadm: no RAID superblock on /dev/sdd2
mdadm: no RAID superblock on /dev/sdd1
mdadm: no RAID superblock on /dev/sdd
mdadm: cannot open device /dev/sr1: No medium found
mdadm: cannot open device /dev/sr0: No medium found
mdadm: no RAID superblock on /dev/sda11
mdadm: no RAID superblock on /dev/sda10
mdadm: no RAID superblock on /dev/sda9
mdadm: no RAID superblock on /dev/sda8
mdadm: no RAID superblock on /dev/sda7
mdadm: no RAID superblock on /dev/sda6
mdadm: no RAID superblock on /dev/sda5
mdadm: no RAID superblock on /dev/sda4
mdadm: no RAID superblock on /dev/sda3
mdadm: no RAID superblock on /dev/sda2
mdadm: no RAID superblock on /dev/sda1
mdadm: no RAID superblock on /dev/sda
mdadm: no RAID superblock on /dev/sdc1
mdadm: /dev/sdb is identified as a member of /dev/md/imsm0, slot -1.
mdadm: /dev/sdc2 is identified as a member of /dev/md/imsm0, slot -1.
mdadm: /dev/sdc is identified as a member of /dev/md/imsm0, slot -1.
mdadm: added /dev/sdc2 to /dev/md/imsm0 as -1
mdadm: failed to add /dev/sdc to /dev/md/imsm0: Device or resource busy
mdadm: added /dev/sdb to /dev/md/imsm0 as -1
mdadm: Container /dev/md/imsm0 has been assembled with 2 drives
mdadm: looking for devices for /dev/md/RAID0
mdadm: looking in container /dev/md/imsm0
mdadm: found match on member /md127/0 in /dev/md/imsm0
mdadm: Started /dev/md/RAID0 with 2 devices

output1 of 'dmesg | grep md:' and 'ls -l /dev/sdc*' - mdadm 3.3:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
md: linear personality registered for level -1
md: raid0 personality registered for level 0
md: raid1 personality registered for level 1
md: raid10 personality registered for level 10
md: raid6 personality registered for level 6
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
md: multipath personality registered for level -4
md: md127 stopped.
md: bind<sdc2>
md: could not open unknown-block(8,32).
md: bind<sdb>
md: md126 stopped.
md: bind<sdb>
md: bind<sdc2>
md: RAID0 configuration for md126 - 1 zone
md: zone0=[sdc2/sdb]
brw-rw---T    1 root     disk        8,  32 Nov 18 15:02 /dev/sdc
brw-rw---T    1 root     disk        8,  33 Nov 18 10:02 /dev/sdc1
brw-rw---T    1 root     disk        8,  34 Nov 18 15:02 /dev/sdc2

output2 of 'mdadm --assemble --scan --no-degraded -v' (mdadm 3.3 -
note using /dev/sdc this time):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mdadm: looking for devices for further assembly
mdadm: no RAID superblock on /dev/sdd4
mdadm: no RAID superblock on /dev/sdd3
mdadm: no RAID superblock on /dev/sdd2
mdadm: no RAID superblock on /dev/sdd1
mdadm: no RAID superblock on /dev/sdd
mdadm: cannot open device /dev/sr1: No medium found
mdadm: cannot open device /dev/sr0: No medium found
mdadm: no RAID superblock on /dev/sdc1
mdadm: no RAID superblock on /dev/sda11
mdadm: no RAID superblock on /dev/sda10
mdadm: no RAID superblock on /dev/sda9
mdadm: no RAID superblock on /dev/sda8
mdadm: no RAID superblock on /dev/sda7
mdadm: no RAID superblock on /dev/sda6
mdadm: no RAID superblock on /dev/sda5
mdadm: no RAID superblock on /dev/sda4
mdadm: no RAID superblock on /dev/sda3
mdadm: no RAID superblock on /dev/sda2
mdadm: no RAID superblock on /dev/sda1
mdadm: no RAID superblock on /dev/sda
mdadm: /dev/sdc2 is identified as a member of /dev/md/imsm0, slot -1.
mdadm: /dev/sdc is identified as a member of /dev/md/imsm0, slot -1.
mdadm: /dev/sdb is identified as a member of /dev/md/imsm0, slot -1.
mdadm: added /dev/sdc to /dev/md/imsm0 as -1
mdadm: added /dev/sdb to /dev/md/imsm0 as -1
mdadm: failed to add /dev/sdc2 to /dev/md/imsm0: Device or resource busy
mdadm: Container /dev/md/imsm0 has been assembled with 2 drives
mdadm: looking for devices for /dev/md/RAID0
mdadm: looking in container /dev/md/imsm0
mdadm: found match on member /md127/0 in /dev/md/imsm0
mdadm: Started /dev/md/RAID0 with 2 devices

output2 of 'dmesg | grep md:' and 'ls -l /dev/sdc*' - mdadm 3.3:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
md: linear personality registered for level -1
md: raid0 personality registered for level 0
md: raid1 personality registered for level 1
md: raid10 personality registered for level 10
md: raid6 personality registered for level 6
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
md: multipath personality registered for level -4
md: md127 stopped.
md: bind<sdc>
md: bind<sdb>
md: could not open unknown-block(8,34).
md: md126 stopped.
md: bind<sdb>
md: bind<sdc>
md: RAID0 configuration for md126 - 1 zone
md: zone0=[sdc/sdb]
brw-rw---T    1 root     disk        8,  32 Nov 18 14:52 /dev/sdc
brw-rw---T    1 root     disk        8,  33 Nov 18 09:52 /dev/sdc1
brw-rw---T    1 root     disk        8,  34 Nov 18 14:52 /dev/sdc2


On Mon, Nov 18, 2013 at 12:22 PM, Martin Wilck <mwilck@arcor.de> wrote:
> On 11/18/2013 07:26 PM, David F. wrote:
>> Hi,
>>
>> we updated our linux disk with mdadm 3.3 from 3.2.6 and customers are
>> finding their RAID is no longer detected.  It's only been a couple
>> weeks and based on the number of customers, we know there is an issue.
>>  We're having those with problems workaround by having them load
>> dmraid instead for now.
>>
>> We also did tests locally and finding intermittent problems with
>> RAID-0 on ISW - sometimes 3.3 doesn't identify both drives as RAID
>> members.  3.2.6 works 100% of the time.
>>
>> Also with DDF RAID - cisco server for example not detecting RAID5 -
>> C220M3_LFF_SpecSheet.pdf. I believe they are using the LSI MegaRaid
>> since DMRAID reports that.
>
> Could you please provide mdadm -E and possibly mdadm --dump output of
> the disks that aren't detected? How does RAID discovery work on your
> systems? Are you using standard udev rules or something special? How
> does your mdadm.conf look like?
>
> Regards
> Martin
>
>
>>
>> Are these problems known - we wouldn't mind moving to the latest
>> version if your pretty sure it fixes it, otherwise we're going to have
>> to revert to 3.2.6?
>>
>> TIA!!
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-18 23:13   ` David F.
@ 2013-11-19  0:01     ` NeilBrown
  2013-11-19 17:05       ` David F.
  2013-11-19 19:45       ` Martin Wilck
  0 siblings, 2 replies; 44+ messages in thread
From: NeilBrown @ 2013-11-19  0:01 UTC (permalink / raw)
  To: David F.; +Cc: Martin Wilck, linux-raid

[-- Attachment #1: Type: text/plain, Size: 1012 bytes --]

On Mon, 18 Nov 2013 15:13:58 -0800 "David F." <df7729@gmail.com> wrote:


> output of 'mdadm --assemble --scan --no-degraded -v' (mdadm 3.2.6):
...
> mdadm: no RAID superblock on /dev/sdc2



> 
> output1 of 'mdadm --assemble --scan --no-degraded -v' (mdadm 3.3 -
> note using /dev/sdc2, not /dev/sdc):
.....
> mdadm: /dev/sdc2 is identified as a member of /dev/md/imsm0, slot -1.

So there is the problem.  mdadm 3.2.6 sees no RAID superblock on sdc2, while
mdadm 3.3 does (but should not).


However that code hasn't changed!

load_super_imsm() still starts with:


	if (test_partition(fd))
		/* IMSM not allowed on partitions */
		return 1;


and test_partition hasn't changed since it was written in April 2010 for
mdadm 3.1.3.

So I'm quite perplexed.

Is your mdadm-3.3 compiled from source or provided by a distro?

Can you run the "mdadm --assemble" under strace and post the result?

  strace -o /tmp/some-file mdadm --assemble --scan ......

Thanks,
NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-19  0:01     ` NeilBrown
@ 2013-11-19 17:05       ` David F.
  2013-11-19 20:38         ` Martin Wilck
  2013-11-19 19:45       ` Martin Wilck
  1 sibling, 1 reply; 44+ messages in thread
From: David F. @ 2013-11-19 17:05 UTC (permalink / raw)
  To: NeilBrown; +Cc: Martin Wilck, linux-raid

Hello,

The mdadm 3.3 binary we've been using was compiled from source on a Debian
Wheezy 32-bit install, with the source downloaded from kernel.org.

But also, this morning, I tried the same test with the 3.3 mdadm binary
taken from the Debian Jessie package here:

packages.debian.org/jessie/i386/mdadm/download
(mdadm_3.3-2_i386.deb)

The results were the same, as this output shows:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mdadm: looking for devices for further assembly
mdadm: no RAID superblock on /dev/sdd4
mdadm: no RAID superblock on /dev/sdd3
mdadm: no RAID superblock on /dev/sdd2
mdadm: no RAID superblock on /dev/sdd1
mdadm: no RAID superblock on /dev/sdd
mdadm: cannot open device /dev/sr1: No medium found
mdadm: cannot open device /dev/sr0: No medium found
mdadm: no RAID superblock on /dev/sda11
mdadm: no RAID superblock on /dev/sda10
mdadm: no RAID superblock on /dev/sda9
mdadm: no RAID superblock on /dev/sda8
mdadm: no RAID superblock on /dev/sda7
mdadm: no RAID superblock on /dev/sda6
mdadm: no RAID superblock on /dev/sda5
mdadm: no RAID superblock on /dev/sda4
mdadm: no RAID superblock on /dev/sda3
mdadm: no RAID superblock on /dev/sda2
mdadm: no RAID superblock on /dev/sda1
mdadm: no RAID superblock on /dev/sda
mdadm: no RAID superblock on /dev/sdc1
mdadm: /dev/sdb is identified as a member of /dev/md/imsm0, slot -1.
mdadm: /dev/sdc2 is identified as a member of /dev/md/imsm0, slot -1.
mdadm: /dev/sdc is identified as a member of /dev/md/imsm0, slot -1.
mdadm: added /dev/sdc2 to /dev/md/imsm0 as -1
mdadm: failed to add /dev/sdc to /dev/md/imsm0: Device or resource busy
mdadm: added /dev/sdb to /dev/md/imsm0 as -1
mdadm: Container /dev/md/imsm0 has been assembled with 2 drives
mdadm: looking for devices for /dev/md/RAID0
mdadm: looking in container /dev/md/imsm0
mdadm: found match on member /md127/0 in /dev/md/imsm0
mdadm: Started /dev/md/RAID0 with 2 devices

I ran strace on this version with this command:
strace -o mdadm33trace.txt mdadm --assemble --scan --no-degraded

Here's a link to the strace output file:
www.dropbox.com/s/l5tcgu8zvjn7eb7/mdadm33trace.txt

On Mon, Nov 18, 2013 at 4:01 PM, NeilBrown <neilb@suse.de> wrote:
> On Mon, 18 Nov 2013 15:13:58 -0800 "David F." <df7729@gmail.com> wrote:
>
>
>> output of 'mdadm --assemble --scan --no-degraded -v' (mdadm 3.2.6):
> ...
>> mdadm: no RAID superblock on /dev/sdc2
>
>
>
>>
>> output1 of 'mdadm --assemble --scan --no-degraded -v' (mdadm 3.3 -
>> note using /dev/sdc2, not /dev/sdc):
> .....
>> mdadm: /dev/sdc2 is identified as a member of /dev/md/imsm0, slot -1.
>
> So there is the problem.  mdadm 3.2.6 sees no RAID superblock on sdc2, while
> mdadm 3.3 does (but should not).
>
>
> However that code hasn't changed!
>
> load_super_imsm() still starts with:
>
>
>         if (test_partition(fd))
>                 /* IMSM not allowed on partitions */
>                 return 1;
>
>
> and test_partition hasn't changed since it was written in April 2010 for
> mdadm 3.1.3.
>
> So I'm quite perplexed.
>
> Is your mdadm-3.3 compiled from source or provided by a distro?
>
> Can you run the "mdadm --assemble" under strace and post the result?
>
>   strace -o /tmp/some-file mdadm --assemble --scan ......
>
> Thanks,
> NeilBrown

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-19  0:01     ` NeilBrown
  2013-11-19 17:05       ` David F.
@ 2013-11-19 19:45       ` Martin Wilck
  2013-11-19 20:08         ` David F.
  2013-11-19 23:51         ` NeilBrown
  1 sibling, 2 replies; 44+ messages in thread
From: Martin Wilck @ 2013-11-19 19:45 UTC (permalink / raw)
  To: NeilBrown; +Cc: David F., linux-raid

On 11/19/2013 01:01 AM, NeilBrown wrote:
> On Mon, 18 Nov 2013 15:13:58 -0800 "David F." <df7729@gmail.com> wrote:
> 
> 
>> output of 'mdadm --assemble --scan --no-degraded -v' (mdadm 3.2.6):
> ...
>> mdadm: no RAID superblock on /dev/sdc2
> 
> 
> 
>>
>> output1 of 'mdadm --assemble --scan --no-degraded -v' (mdadm 3.3 -
>> note using /dev/sdc2, not /dev/sdc):
> .....
>> mdadm: /dev/sdc2 is identified as a member of /dev/md/imsm0, slot -1.
> 
> So there is the problem.  mdadm 3.2.6 sees no RAID superblock on sdc2, while
> mdadm 3.3 does (but should not).
> 
> 
> However that code hasn't changed!
> 
> load_super_imsm() still starts with:
> 
> 
> 	if (test_partition(fd))
> 		/* IMSM not allowed on partitions */
> 		return 1;

Well not quite - you changed that code in commit b31df436 "intel,ddf:
don't require partitions when ignore_hw_compat is set". Maybe there's
something wrong with that ignore_hw_compat logic?

In the strace I don't see any indication of test_partition having been
called; that's another hint in that direction.
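
For reference, the guard at the top of load_super_imsm() after that commit
reads (as the patch diff later in this thread also shows):

	if (!st->ignore_hw_compat && test_partition(fd))
		/* IMSM not allowed on partitions */
		return 1;

so with ignore_hw_compat set during --assemble, test_partition() is never
reached, which would match the strace showing no sign of that probe.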

Martin

> 
> 
> and test_partition hasn't changed since it was written in April 2010 for
> mdadm 3.1.3.
> 
> So I'm quite perplexed.
> 
> Is your mdadm-3.3 compiled from source or provided by a distro?
> 
> Can you run the "mdadm --assemble" under strace and post the result?
> 
>   strace -o /tmp/some-file mdadm --assemble --scan ......
> 
> Thanks,
> NeilBrown


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-19 19:45       ` Martin Wilck
@ 2013-11-19 20:08         ` David F.
  2013-11-19 23:51         ` NeilBrown
  1 sibling, 0 replies; 44+ messages in thread
From: David F. @ 2013-11-19 20:08 UTC (permalink / raw)
  To: Martin Wilck; +Cc: NeilBrown, linux-raid

And don't forget it's not 100% of the time - 3.3 gives a problem about
50% of the time, so we're not sure if it's an uninitialized variable or
something like that.  3.2.6 works 100% of the time.  And again, this is
on RAID0 - we haven't seen a problem with RAID1.

On Tue, Nov 19, 2013 at 11:45 AM, Martin Wilck <mwilck@arcor.de> wrote:
> On 11/19/2013 01:01 AM, NeilBrown wrote:
>> On Mon, 18 Nov 2013 15:13:58 -0800 "David F." <df7729@gmail.com> wrote:
>>
>>
>>> output of 'mdadm --assemble --scan --no-degraded -v' (mdadm 3.2.6):
>> ...
>>> mdadm: no RAID superblock on /dev/sdc2
>>
>>
>>
>>>
>>> output1 of 'mdadm --assemble --scan --no-degraded -v' (mdadm 3.3 -
>>> note using /dev/sdc2, not /dev/sdc):
>> .....
>>> mdadm: /dev/sdc2 is identified as a member of /dev/md/imsm0, slot -1.
>>
>> So there is the problem.  mdadm 3.2.6 sees no RAID superblock on sdc2, while
>> mdadm 3.3 does (but should not).
>>
>>
>> However that code hasn't changed!
>>
>> load_super_imsm() still starts with:
>>
>>
>>       if (test_partition(fd))
>>               /* IMSM not allowed on partitions */
>>               return 1;
>
> Well not quite - you changed that code in commit b31df436 "intel,ddf:
> don't require partitions when ignore_hw_compat is set". Maybe there's
> something wrong with that ignore_hw_compat logic?
>
> In the strace I don't see indication of test_partition having been
> called, that's another hint in that direction.
>
> Martin
>
>>
>>
>> and test_partition hasn't changed since it was written in April 2010 for
>> mdadm 3.1.3.
>>
>> So I'm quite perplexed.
>>
>> Is your mdadm-3.3 compiled from source or provided by a distro?
>>
>> Can you run the "mdadm --assemble" under strace and post the result?
>>
>>   strace -o /tmp/some-file mdadm --assemble --scan ......
>>
>> Thanks,
>> NeilBrown
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-19 17:05       ` David F.
@ 2013-11-19 20:38         ` Martin Wilck
  2013-11-19 22:34           ` David F.
  2013-11-19 22:49           ` David F.
  0 siblings, 2 replies; 44+ messages in thread
From: Martin Wilck @ 2013-11-19 20:38 UTC (permalink / raw)
  To: David F.; +Cc: NeilBrown, linux-raid

The question I have is why there is IMSM metadata on sdc2 at all.
IMSM metadata sits at the end of a block device. So I figure that sdc2 is
the last partition, and by some weird circumstance it's so large that it
includes the last sectors of the physical disk where the metadata
resides. That would be a bad idea: a dd if=/dev/zero of=/dev/sdc2 would
wipe not only the partition but also the RAID metadata.
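
As a rough way to check that, one could compare the partition geometry from
sysfs with the disk size.  The sketch below is illustrative only - the
sdc/sdc2 names and the assumed 2 MiB window at the end of the disk are
placeholders, since the exact size of the IMSM metadata area varies:

/* part-overlap.c: hedged sketch, not mdadm source.  Check whether a
 * partition's end reaches into the last few sectors of its parent
 * disk, where IMSM metadata would live.  All sysfs values are in
 * 512-byte sectors. */
#include <stdio.h>

static unsigned long long read_ull(const char *path)
{
	unsigned long long v = 0;
	FILE *f = fopen(path, "r");

	if (f) {
		if (fscanf(f, "%llu", &v) != 1)
			v = 0;
		fclose(f);
	}
	return v;
}

int main(void)
{
	unsigned long long disk   = read_ull("/sys/block/sdc/size");
	unsigned long long start  = read_ull("/sys/block/sdc/sdc2/start");
	unsigned long long length = read_ull("/sys/block/sdc/sdc2/size");
	unsigned long long window = (2 * 1024 * 1024) / 512; /* assumed metadata window */

	if (!disk || !length) {
		fprintf(stderr, "could not read sysfs geometry\n");
		return 1;
	}
	if (start + length > disk - window)
		printf("sdc2 reaches into the last %llu sectors of sdc - "
		       "it can see the metadata area\n", window);
	else
		printf("sdc2 ends before the metadata area\n");
	return 0;
}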

Martin


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-19 20:38         ` Martin Wilck
@ 2013-11-19 22:34           ` David F.
  2013-11-19 22:49           ` David F.
  1 sibling, 0 replies; 44+ messages in thread
From: David F. @ 2013-11-19 22:34 UTC (permalink / raw)
  To: Martin Wilck; +Cc: NeilBrown, linux-raid

For Cisco's server:

> mdadm --examine --scan
ARRAY metadata=ddf UUID=7ab254d0:fae71048:404edde9:750a8a05
ARRAY container=7ab254d0:fae71048:404edde9:750a8a05 member=0
UUID=5337ab03:86ca2abc:d42bfbc8:23626c78

> mdadm --assemble --scan --no-degraded -v
mdadm: looking for devices for further assembly
mdadm: /dev/md/ddf0 is a container, but we are looking for components
mdadm: no RAID superblock on /dev/sdf
mdadm: no RAID superblock on /dev/md/MegaSR2
mdadm: no RAID superblock on /dev/md/MegaSR1
mdadm: no RAID superblock on /dev/md/MegaSR
mdadm: cannot open device /dev/sr0: No medium found
mdadm: /dev/sdd is busy - skipping
mdadm: /dev/sdc is busy - skipping
mdadm: /dev/sdb is busy - skipping
mdadm: /dev/sda is busy - skipping
mdadm: /dev/sde is busy - skipping
mdadm: looking for devices for further assembly
mdadm: looking in container /dev/md/ddf0
mdadm: member /md127/0 in /dev/md/ddf0 is already assembled
mdadm: Cannot assemble mbr metadata on /dev/sdf
mdadm: Cannot assemble mbr metadata on /dev/md/MegaSR2
mdadm: Cannot assemble mbr metadata on /dev/md/MegaSR1
mdadm: Cannot assemble mbr metadata on /dev/md/MegaSR
mdadm: cannot open device /dev/sr0: No medium found
mdadm: /dev/sdd has wrong uuid.
mdadm: /dev/sdc has wrong uuid.
mdadm: /dev/sdb has wrong uuid.
mdadm: /dev/sda has wrong uuid.
mdadm: /dev/sde has wrong uuid.



On Tue, Nov 19, 2013 at 12:38 PM, Martin Wilck <mwilck@arcor.de> wrote:
> The question I have is why is there IMSM meta data on sdc2 at all?
> IMSM metadata sit at the end of a block device. So I figure that sdc2 is
> the last partition, and by some wird circumstance it's so large that it
> includes the last sectors of the physical disk where the metadata
> resides. That would be a bad idea, a dd if=/dev/zero of=/dev/sdc2 would
> wipe not only the partition but also the RAID meta data.
>
> Martin
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-19 20:38         ` Martin Wilck
  2013-11-19 22:34           ` David F.
@ 2013-11-19 22:49           ` David F.
  1 sibling, 0 replies; 44+ messages in thread
From: David F. @ 2013-11-19 22:49 UTC (permalink / raw)
  To: Martin Wilck; +Cc: NeilBrown, linux-raid

I believe it's because the partitions take up the entire provided md
device - so if you don't adjust the size of the device to protect the
metadata, the partition could go to the end of the drive.

On Tue, Nov 19, 2013 at 12:38 PM, Martin Wilck <mwilck@arcor.de> wrote:
> The question I have is why is there IMSM meta data on sdc2 at all?
> IMSM metadata sit at the end of a block device. So I figure that sdc2 is
> the last partition, and by some wird circumstance it's so large that it
> includes the last sectors of the physical disk where the metadata
> resides. That would be a bad idea, a dd if=/dev/zero of=/dev/sdc2 would
> wipe not only the partition but also the RAID meta data.
>
> Martin
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-19 19:45       ` Martin Wilck
  2013-11-19 20:08         ` David F.
@ 2013-11-19 23:51         ` NeilBrown
  2013-11-20  0:22           ` David F.
  1 sibling, 1 reply; 44+ messages in thread
From: NeilBrown @ 2013-11-19 23:51 UTC (permalink / raw)
  To: Martin Wilck; +Cc: David F., linux-raid

[-- Attachment #1: Type: text/plain, Size: 3118 bytes --]

On Tue, 19 Nov 2013 20:45:47 +0100 Martin Wilck <mwilck@arcor.de> wrote:

> On 11/19/2013 01:01 AM, NeilBrown wrote:
> > On Mon, 18 Nov 2013 15:13:58 -0800 "David F." <df7729@gmail.com> wrote:
> > 
> > 
> >> output of 'mdadm --assemble --scan --no-degraded -v' (mdadm 3.2.6):
> > ...
> >> mdadm: no RAID superblock on /dev/sdc2
> > 
> > 
> > 
> >>
> >> output1 of 'mdadm --assemble --scan --no-degraded -v' (mdadm 3.3 -
> >> note using /dev/sdc2, not /dev/sdc):
> > .....
> >> mdadm: /dev/sdc2 is identified as a member of /dev/md/imsm0, slot -1.
> > 
> > So there is the problem.  mdadm 3.2.6 sees no RAID superblock on sdc2, while
> > mdadm 3.3 does (but should not).
> > 
> > 
> > However that code hasn't changed!
> > 
> > load_super_imsm() still starts with:
> > 
> > 
> > 	if (test_partition(fd))
> > 		/* IMSM not allowed on partitions */
> > 		return 1;
> 
> Well not quite - you changed that code in commit b31df436 "intel,ddf:
> don't require partitions when ignore_hw_compat is set". Maybe there's
> something wrong with that ignore_hw_compat logic?
> 
> In the strace I don't see indication of test_partition having been
> called, that's another hint in that direction.
> 

Yes... it seems I was accidentally looking at an old version of mdadm.

I've just committed the following patch, which should fix the problem.

(git clone git://neil.brown.name/mdadm/ ; cd mdadm ; make ; make install)

Thanks,
NeilBrown

From 357ac1067835d1cdd5f80acc28501db0ffc64957 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Wed, 20 Nov 2013 10:49:14 +1100
Subject: [PATCH] IMSM metadata really should be ignored when found on
 partitions.

commit b31df43682216d1c65813eae49ebdd8253db8907
changed load_super_imsm to not insist on finding a partition if
ignore_hw_compat was set.
Unfortunately this is set for '--assemble' so arrays could get
assembled badly.

The comment says this was to allow e.g. --examine of image files.
A better fix for this is to change test_partition to not report
a regular file as being a partition.
The errors from the BLKPG ioctl are:

 ENOTTY : not a block device.
 EINVAL : not a whole device (probably a partition)
 ENXIO  : partition doesn't exist (so not a partition)

Reported-by: "David F." <df7729@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>

diff --git a/super-intel.c b/super-intel.c
index 7b2406866493..c103ffdd2dd8 100644
--- a/super-intel.c
+++ b/super-intel.c
@@ -4423,7 +4423,7 @@ static int load_super_imsm(struct supertype *st, int fd, char *devname)
 	struct intel_super *super;
 	int rv;
 
-	if (!st->ignore_hw_compat && test_partition(fd))
+	if (test_partition(fd))
 		/* IMSM not allowed on partitions */
 		return 1;
 
diff --git a/util.c b/util.c
index 5f95f1f97c02..b29a3ee7ce47 100644
--- a/util.c
+++ b/util.c
@@ -307,7 +307,7 @@ int test_partition(int fd)
 	if (ioctl(fd, BLKPG, &a) == 0)
 		/* Very unlikely, but not a partition */
 		return 0;
-	if (errno == ENXIO)
+	if (errno == ENXIO || errno == ENOTTY)
 		/* not a partition */
 		return 0;
 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
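
As a concrete illustration of the probe the patch adjusts, here is a minimal
standalone tester - a sketch modeled on the util.c snippet above, not a copy
of mdadm's code - that opens a device node and reports how the BLKPG probe
classifies it once the ENOTTY case is handled.  It assumes the Linux
<linux/blkpg.h> and <linux/fs.h> headers and should be run as root, as mdadm
itself would be.

/* blkpg-probe.c: hedged sketch, not mdadm source.  Classify a device
 * node as "partition" or "not a partition" using the BLKPG ioctl,
 * with the ENOTTY fix from the patch above applied. */
#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>
#include <linux/blkpg.h>

static int is_partition(int fd)
{
	struct blkpg_partition p;
	struct blkpg_ioctl_arg a;

	memset(&p, 0, sizeof(p));
	memset(&a, 0, sizeof(a));
	p.pno = 1 << 30;	/* a partition number that cannot exist */
	a.op = BLKPG_DEL_PARTITION;
	a.data = &p;
	a.datalen = sizeof(p);

	if (ioctl(fd, BLKPG, &a) == 0)
		return 0;	/* delete "worked": a whole device (very unlikely) */
	if (errno == ENXIO || errno == ENOTTY)
		return 0;	/* no such partition, or not a block device at all */
	return 1;		/* EINVAL: fd refers to a partition */
}

int main(int argc, char **argv)
{
	int fd;

	if (argc != 2) {
		fprintf(stderr, "usage: %s /dev/device\n", argv[0]);
		return 2;
	}
	fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 2;
	}
	printf("%s: %s\n", argv[1],
	       is_partition(fd) ? "partition" : "not a partition");
	close(fd);
	return 0;
}

With the fix, /dev/sdc should report "not a partition" and /dev/sdc2
"partition", so load_super_imsm() would again skip sdc2 during --assemble.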

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-19 23:51         ` NeilBrown
@ 2013-11-20  0:22           ` David F.
  2013-11-20  0:35             ` David F.
  0 siblings, 1 reply; 44+ messages in thread
From: David F. @ 2013-11-20  0:22 UTC (permalink / raw)
  To: NeilBrown; +Cc: Martin Wilck, linux-raid

Okay, I'll have them build it and try.

Another note on why the metadata may be in the partition - that system
has the RAID changed between 0 and 1 all the time, so perhaps the old
metadata at the end of drive 0 from RAID1 ends up in the middle of the
partition when it's RAID0?

On Tue, Nov 19, 2013 at 3:51 PM, NeilBrown <neilb@suse.de> wrote:
> On Tue, 19 Nov 2013 20:45:47 +0100 Martin Wilck <mwilck@arcor.de> wrote:
>
>> On 11/19/2013 01:01 AM, NeilBrown wrote:
>> > On Mon, 18 Nov 2013 15:13:58 -0800 "David F." <df7729@gmail.com> wrote:
>> >
>> >
>> >> output of 'mdadm --assemble --scan --no-degraded -v' (mdadm 3.2.6):
>> > ...
>> >> mdadm: no RAID superblock on /dev/sdc2
>> >
>> >
>> >
>> >>
>> >> output1 of 'mdadm --assemble --scan --no-degraded -v' (mdadm 3.3 -
>> >> note using /dev/sdc2, not /dev/sdc):
>> > .....
>> >> mdadm: /dev/sdc2 is identified as a member of /dev/md/imsm0, slot -1.
>> >
>> > So there is the problem.  mdadm 3.2.6 sees no RAID superblock on sdc2, while
>> > mdadm 3.3 does (but should not).
>> >
>> >
>> > However that code hasn't changed!
>> >
>> > load_super_imsm() still starts with:
>> >
>> >
>> >     if (test_partition(fd))
>> >             /* IMSM not allowed on partitions */
>> >             return 1;
>>
>> Well not quite - you changed that code in commit b31df436 "intel,ddf:
>> don't require partitions when ignore_hw_compat is set". Maybe there's
>> something wrong with that ignore_hw_compat logic?
>>
>> In the strace I don't see indication of test_partition having been
>> called, that's another hint in that direction.
>>
>
> Yes... I seems I was accidentally looking at an old version of mdadm.
>
> I've just committed the following patch which should fix the problem.
>
> (git clone git://neil.brown.name/mdadm/ ; cd mdadm ; make;make install)
>
> Thanks,
> NeilBrown
>
> From 357ac1067835d1cdd5f80acc28501db0ffc64957 Mon Sep 17 00:00:00 2001
> From: NeilBrown <neilb@suse.de>
> Date: Wed, 20 Nov 2013 10:49:14 +1100
> Subject: [PATCH] IMSM metadata really should be ignored when found on
>  partitions.
>
> commit b31df43682216d1c65813eae49ebdd8253db8907
> changed load_super_imsm to not insist on finding a partition if
> ignore_hw_compat was set.
> Unfortunately this is set for '--assemble' so arrays could get
> assembled badly.
>
> The comment says this was to allow e.g. --examine of image files.
> A better fixes for this is to change test_partitions to not report
> a regular file as being a partition.
> The errors from the BLKPG ioctl are:
>
>  ENOTTY : not a block device.
>  EINVAL : not a whole device (probably a partition)
>  ENXIO  : partition doesn't exist (so not a partition)
>
> Reported-by: "David F." <df7729@gmail.com>
> Signed-off-by: NeilBrown <neilb@suse.de>
>
> diff --git a/super-intel.c b/super-intel.c
> index 7b2406866493..c103ffdd2dd8 100644
> --- a/super-intel.c
> +++ b/super-intel.c
> @@ -4423,7 +4423,7 @@ static int load_super_imsm(struct supertype *st, int fd, char *devname)
>         struct intel_super *super;
>         int rv;
>
> -       if (!st->ignore_hw_compat && test_partition(fd))
> +       if (test_partition(fd))
>                 /* IMSM not allowed on partitions */
>                 return 1;
>
> diff --git a/util.c b/util.c
> index 5f95f1f97c02..b29a3ee7ce47 100644
> --- a/util.c
> +++ b/util.c
> @@ -307,7 +307,7 @@ int test_partition(int fd)
>         if (ioctl(fd, BLKPG, &a) == 0)
>                 /* Very unlikely, but not a partition */
>                 return 0;
> -       if (errno == ENXIO)
> +       if (errno == ENXIO || errno == ENOTTY)
>                 /* not a partition */
>                 return 0;
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-20  0:22           ` David F.
@ 2013-11-20  0:35             ` David F.
  2013-11-20  0:48               ` NeilBrown
  0 siblings, 1 reply; 44+ messages in thread
From: David F. @ 2013-11-20  0:35 UTC (permalink / raw)
  To: NeilBrown; +Cc: Martin Wilck, linux-raid

FWIW, we confirmed that it does appear to be the old RAID1 metadata it
was finding - when the partition is below half the size (of the RAID0
md drive) it didn't find it in the partition; when resized to just
beyond half that size, it then started finding it in the partition.

Also, was the Cisco information useful for working out why their Cisco
server (using LSI) won't assemble its RAID5?  Do you think the patch
deals with that as well?

On Tue, Nov 19, 2013 at 4:22 PM, David F. <df7729@gmail.com> wrote:
> Okay, I'll have them build it and try.
>
> Another note on the last reason meta data may be in the partition -
> that system has the raid changed from 0 to 1 all the time so perhaps
> the old meta data at the end of drive 0 from RAID1 ends up in the
> middle of the partition when it's RAID0. ?
>
> On Tue, Nov 19, 2013 at 3:51 PM, NeilBrown <neilb@suse.de> wrote:
>> On Tue, 19 Nov 2013 20:45:47 +0100 Martin Wilck <mwilck@arcor.de> wrote:
>>
>>> On 11/19/2013 01:01 AM, NeilBrown wrote:
>>> > On Mon, 18 Nov 2013 15:13:58 -0800 "David F." <df7729@gmail.com> wrote:
>>> >
>>> >
>>> >> output of 'mdadm --assemble --scan --no-degraded -v' (mdadm 3.2.6):
>>> > ...
>>> >> mdadm: no RAID superblock on /dev/sdc2
>>> >
>>> >
>>> >
>>> >>
>>> >> output1 of 'mdadm --assemble --scan --no-degraded -v' (mdadm 3.3 -
>>> >> note using /dev/sdc2, not /dev/sdc):
>>> > .....
>>> >> mdadm: /dev/sdc2 is identified as a member of /dev/md/imsm0, slot -1.
>>> >
>>> > So there is the problem.  mdadm 3.2.6 sees no RAID superblock on sdc2, while
>>> > mdadm 3.3 does (but should not).
>>> >
>>> >
>>> > However that code hasn't changed!
>>> >
>>> > load_super_imsm() still starts with:
>>> >
>>> >
>>> >     if (test_partition(fd))
>>> >             /* IMSM not allowed on partitions */
>>> >             return 1;
>>>
>>> Well not quite - you changed that code in commit b31df436 "intel,ddf:
>>> don't require partitions when ignore_hw_compat is set". Maybe there's
>>> something wrong with that ignore_hw_compat logic?
>>>
>>> In the strace I don't see indication of test_partition having been
>>> called, that's another hint in that direction.
>>>
>>
>> Yes... I seems I was accidentally looking at an old version of mdadm.
>>
>> I've just committed the following patch which should fix the problem.
>>
>> (git clone git://neil.brown.name/mdadm/ ; cd mdadm ; make;make install)
>>
>> Thanks,
>> NeilBrown
>>
>> From 357ac1067835d1cdd5f80acc28501db0ffc64957 Mon Sep 17 00:00:00 2001
>> From: NeilBrown <neilb@suse.de>
>> Date: Wed, 20 Nov 2013 10:49:14 +1100
>> Subject: [PATCH] IMSM metadata really should be ignored when found on
>>  partitions.
>>
>> commit b31df43682216d1c65813eae49ebdd8253db8907
>> changed load_super_imsm to not insist on finding a partition if
>> ignore_hw_compat was set.
>> Unfortunately this is set for '--assemble' so arrays could get
>> assembled badly.
>>
>> The comment says this was to allow e.g. --examine of image files.
>> A better fixes for this is to change test_partitions to not report
>> a regular file as being a partition.
>> The errors from the BLKPG ioctl are:
>>
>>  ENOTTY : not a block device.
>>  EINVAL : not a whole device (probably a partition)
>>  ENXIO  : partition doesn't exist (so not a partition)
>>
>> Reported-by: "David F." <df7729@gmail.com>
>> Signed-off-by: NeilBrown <neilb@suse.de>
>>
>> diff --git a/super-intel.c b/super-intel.c
>> index 7b2406866493..c103ffdd2dd8 100644
>> --- a/super-intel.c
>> +++ b/super-intel.c
>> @@ -4423,7 +4423,7 @@ static int load_super_imsm(struct supertype *st, int fd, char *devname)
>>         struct intel_super *super;
>>         int rv;
>>
>> -       if (!st->ignore_hw_compat && test_partition(fd))
>> +       if (test_partition(fd))
>>                 /* IMSM not allowed on partitions */
>>                 return 1;
>>
>> diff --git a/util.c b/util.c
>> index 5f95f1f97c02..b29a3ee7ce47 100644
>> --- a/util.c
>> +++ b/util.c
>> @@ -307,7 +307,7 @@ int test_partition(int fd)
>>         if (ioctl(fd, BLKPG, &a) == 0)
>>                 /* Very unlikely, but not a partition */
>>                 return 0;
>> -       if (errno == ENXIO)
>> +       if (errno == ENXIO || errno == ENOTTY)
>>                 /* not a partition */
>>                 return 0;
>>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-20  0:35             ` David F.
@ 2013-11-20  0:48               ` NeilBrown
  2013-11-20  1:29                 ` David F.
  0 siblings, 1 reply; 44+ messages in thread
From: NeilBrown @ 2013-11-20  0:48 UTC (permalink / raw)
  To: David F.; +Cc: Martin Wilck, linux-raid

[-- Attachment #1: Type: text/plain, Size: 916 bytes --]

On Tue, 19 Nov 2013 16:35:44 -0800 "David F." <df7729@gmail.com> wrote:

> FWIW, we confirmed that it appears to be the old RAID1 data it was
> finding - when the partition is below half the size it didn't find it
> in the partition, when resized just beyond half the size (of the RAID0
> md drive), it then started finding it in the partition.
> 
> Also, was Cisco's information useful for why their cisco server (using
> lsi)  RAID5 won't assembly?  Do you think the patch deals with that as
> well?
> 
>

You did say something about Cisco in the original post, but it wasn't at all
clear what you were asking or what the context was.

Could you please spell it all out?  You said something about dmraid reporting
something; showing the output of dmraid in that case wouldn't hurt.
And showing the output of "mdadm --examine /dev/device" for any relevant
device is often a good idea.

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-20  0:48               ` NeilBrown
@ 2013-11-20  1:29                 ` David F.
  2013-11-20  1:34                   ` David F.
  0 siblings, 1 reply; 44+ messages in thread
From: David F. @ 2013-11-20  1:29 UTC (permalink / raw)
  To: NeilBrown; +Cc: Martin Wilck, linux-raid

Hi,

Yes, it's in this thread already - but I'll repost here:

For Cisco's server:

> mdadm --examine --scan
ARRAY metadata=ddf UUID=7ab254d0:fae71048:404edde9:750a8a05
ARRAY container=7ab254d0:fae71048:404edde9:750a8a05 member=0
UUID=5337ab03:86ca2abc:d42bfbc8:23626c78

> mdadm --assemble --scan --no-degraded -v
mdadm: looking for devices for further assembly
mdadm: /dev/md/ddf0 is a container, but we are looking for components
mdadm: no RAID superblock on /dev/sdf
mdadm: no RAID superblock on /dev/md/MegaSR2
mdadm: no RAID superblock on /dev/md/MegaSR1
mdadm: no RAID superblock on /dev/md/MegaSR
mdadm: cannot open device /dev/sr0: No medium found
mdadm: /dev/sdd is busy - skipping
mdadm: /dev/sdc is busy - skipping
mdadm: /dev/sdb is busy - skipping
mdadm: /dev/sda is busy - skipping
mdadm: /dev/sde is busy - skipping
mdadm: looking for devices for further assembly
mdadm: looking in container /dev/md/ddf0
mdadm: member /md127/0 in /dev/md/ddf0 is already assembled
mdadm: Cannot assemble mbr metadata on /dev/sdf
mdadm: Cannot assemble mbr metadata on /dev/md/MegaSR2
mdadm: Cannot assemble mbr metadata on /dev/md/MegaSR1
mdadm: Cannot assemble mbr metadata on /dev/md/MegaSR
mdadm: cannot open device /dev/sr0: No medium found
mdadm: /dev/sdd has wrong uuid.
mdadm: /dev/sdc has wrong uuid.
mdadm: /dev/sdb has wrong uuid.
mdadm: /dev/sda has wrong uuid.
mdadm: /dev/sde has wrong uuid.




On Tue, Nov 19, 2013 at 4:48 PM, NeilBrown <neilb@suse.de> wrote:
> On Tue, 19 Nov 2013 16:35:44 -0800 "David F." <df7729@gmail.com> wrote:
>>
>> Also, was Cisco's information useful for why their cisco server (using
>> lsi)  RAID5 won't assembly?  Do you think the patch deals with that as
>> well?
>>
>>
>
> You did say something about Cisco in the original post but it wasn't at all
> clear what you were asking or what the context was.
>
> Could you please spell it all out.  You said something about dmraid reporting
> something.  Showing the output of dmraid in that case wouldn't hurt.
> And show the output of "mdadm --examine /dev/device" for any relevant device
> is often a good idea.
>
> NeilBrown

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-20  1:29                 ` David F.
@ 2013-11-20  1:34                   ` David F.
  2013-11-20  2:30                     ` NeilBrown
  0 siblings, 1 reply; 44+ messages in thread
From: David F. @ 2013-11-20  1:34 UTC (permalink / raw)
  To: NeilBrown; +Cc: Martin Wilck, linux-raid

And here's some more data I have for them (again they have RAID5):

Storage devices (non-RAID) detected by Linux based on /sys/block:

Optical drives (srx):
  sr0: hp       DVDRAM GT30L     (fw rev = mP06)
Removable drives (sdx):
  sdb: 7453 MiB          Patriot Memory
Fixed hard drives (sdx):
  sdc: 136 GiB TOSHIBA  MK1401GRRB
  sdd: 136 GiB TOSHIBA  MK1401GRRB
  sde: 136 GiB TOSHIBA  MK1401GRRB
  sdf: 136 GiB TOSHIBA  MK1401GRRB
  sdg: 136 GiB TOSHIBA  MK1401GRRB

Contents of /sys/block:
md127  ram0   ram1   sdb    sdc    sdd    sde    sdf    sdg    sr0

Contents of /dev:
block               port                tty1                tty47
bsg                 ppp                 tty10               tty48
bus                 psaux               tty11               tty49
cdrom               ptmx                tty12               tty5
cdrom1              ptp0                tty13               tty50
cdrw                ptp1                tty14               tty51
cdrw1               pts                 tty15               tty52
char                ram0                tty16               tty53
console             ram1                tty17               tty54
core                random              tty18               tty55
cpu_dma_latency     rfkill              tty19               tty56
disk                rtc                 tty2                tty57
dvd                 rtc0                tty20               tty58
dvd1                sda                 tty21               tty59
dvdrw               sdb                 tty22               tty6
dvdrw1              sdb1                tty23               tty60
fd                  sdc                 tty24               tty61
full                sdc1                tty25               tty62
fuse                sdc2                tty26               tty63
hidraw0             sdd                 tty27               tty7
hidraw1             sde                 tty28               tty8
hidraw2             sdf                 tty29               tty9
hidraw3             sdf1                tty3                ttyS0
hidraw4             sdf2                tty30               ttyS1
hidraw5             sdg                 tty31               ttyS2
hpet                sg0                 tty32               ttyS3
input               sg1                 tty33               urandom
kmsg                sg2                 tty34               vcs
log                 sg3                 tty35               vcs1
loop-control        sg4                 tty36               vcs2
loop0               sg5                 tty37               vcs3
loop1               sg6                 tty38               vcs4
mapper              sg7                 tty39               vcs5
mcelog              shm                 tty4                vcsa
md                  sr0                 tty40               vcsa1
md127               stderr              tty41               vcsa2
mem                 stdin               tty42               vcsa3
net                 stdout              tty43               vcsa4
network_latency     synth               tty44               vcsa5
network_throughput  tty                 tty45               vga_arbiter
null                tty0                tty46               zero

Contents of /proc/partitions:
major minor  #blocks  name

   8       32  143638992 sdc
   8       33     102400 sdc1
   8       34  143535568 sdc2
   8       48  143638992 sdd
   8       64  143638992 sde
   8       80  143638992 sdf
   8       81     102400 sdf1
   8       82  143535568 sdf2
   8       96  143638992 sdg
  11        0      48160 sr0
   8       16    7632892 sdb

Disk /dev/sdc: 147.0 GB, 147086327808 bytes
255 heads, 63 sectors/track, 17882 cylinders, total 287277984 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start         End      Blocks  Id System
/dev/sdc1   *        2048      206847      102400   7 HPFS/NTFS
Partition 1 does not end on cylinder boundary
/dev/sdc2          206848   855463935   427628544   7 HPFS/NTFS

Disk /dev/sdd: 147.0 GB, 147086327808 bytes
255 heads, 63 sectors/track, 17882 cylinders, total 287277984 sectors
Units = sectors of 1 * 512 = 512 bytes

Disk /dev/sdd doesn't contain a valid partition table

Disk /dev/sde: 147.0 GB, 147086327808 bytes
255 heads, 63 sectors/track, 17882 cylinders, total 287277984 sectors
Units = sectors of 1 * 512 = 512 bytes

Disk /dev/sde doesn't contain a valid partition table

Disk /dev/sdf: 147.0 GB, 147086327808 bytes
255 heads, 63 sectors/track, 17882 cylinders, total 287277984 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start         End      Blocks  Id System
/dev/sdf1   *        2048      206847      102400   7 HPFS/NTFS
Partition 1 does not end on cylinder boundary
/dev/sdf2          206848   855463935   427628544   7 HPFS/NTFS

Disk /dev/sdg: 147.0 GB, 147086327808 bytes
255 heads, 63 sectors/track, 17882 cylinders, total 287277984 sectors
Units = sectors of 1 * 512 = 512 bytes

Disk /dev/sdg doesn't contain a valid partition table

Disk /dev/sdb: 7816 MB, 7816081408 bytes
241 heads, 62 sectors/track, 1021 cylinders, total 15265784 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start         End      Blocks  Id System
/dev/sdb1   ?   778135908  1919645538   570754815+ 72 Unknown
Partition 1 has different physical/logical beginnings (non-Linux?):
     phys=(357, 116, 40) logical=(52077, 22, 11)
Partition 1 has different physical/logical endings:
     phys=(357, 32, 45) logical=(128473, 31, 51)
Partition 1 does not end on cylinder boundary
/dev/sdb2   ?   168689522  2104717761   968014120  65 Unknown
Partition 2 has different physical/logical beginnings (non-Linux?):
     phys=(288, 115, 43) logical=(11289, 149, 47)
Partition 2 has different physical/logical endings:
     phys=(367, 114, 50) logical=(140859, 41, 42)
Partition 2 does not end on cylinder boundary
/dev/sdb3   ?  1869881465  3805909656   968014096  79 Unknown
Partition 3 has different physical/logical beginnings (non-Linux?):
     phys=(366, 32, 33) logical=(125142, 156, 30)
Partition 3 has different physical/logical endings:
     phys=(357, 32, 43) logical=(254712, 47, 39)
Partition 3 does not end on cylinder boundary
/dev/sdb4   ?  2885681152  2885736650       27749+  d Unknown
Partition 4 has different physical/logical beginnings (non-Linux?):
     phys=(372, 97, 50) logical=(193125, 119, 25)
Partition 4 has different physical/logical endings:
     phys=(0, 10, 0) logical=(193129, 50, 33)
Partition 4 does not end on cylinder boundary

Partition table entries are not in disk order

Contents of /proc/mdstat (Linux software RAID status):
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
[raid4] [multipath]
md127 : inactive sdg[0](S)
      1061328 blocks super external:ddf

unused devices: <none>

Contents of /run/mdadm/map (Linux software RAID arrays):
md127 ddf 7ab254d0:fae71048:404edde9:750a8a05 /dev/md/ddf0

Contents of /etc/mdadm/mdadm.conf (Linux software RAID config file):
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
DEVICE partitions containers

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

ARRAY metadata=ddf UUID=7ab254d0:fae71048:404edde9:750a8a05
ARRAY container=7ab254d0:fae71048:404edde9:750a8a05 member=0
UUID=45b3ab73:5c998afc:01bbf815:12660984

Long listing of /dev/md directory and /dev/md* device files (Linux
software RAID devices):
brw-rw---T    1 root     disk        9, 127 Nov 18 17:12 /dev/md127

/dev/md:
lrwxrwxrwx    1 root     root             8 Nov 18 17:12 ddf0 -> ../md127

Contents of /dev/mapper directory:
crw------T    1 root     root       10, 236 Nov 18 17:12 control

startraid mode = null (default)

Contents of dmraid-boot.txt file (from activate command on boot):
NOTICE: checking format identifier asr
NOTICE: checking format identifier hpt37x
NOTICE: checking format identifier hpt45x
NOTICE: checking format identifier jmicron
NOTICE: checking format identifier lsi
NOTICE: checking format identifier nvidia
NOTICE: checking format identifier pdc
NOTICE: checking format identifier sil
NOTICE: checking format identifier via
NOTICE: checking format identifier dos
WARN: locking /var/lock/dmraid/.lock
NOTICE: skipping removable device /dev/sda
NOTICE: /dev/sdg: asr     discovering
NOTICE: /dev/sdg: hpt37x  discovering
NOTICE: /dev/sdg: hpt45x  discovering
NOTICE: /dev/sdg: jmicron discovering
NOTICE: /dev/sdg: lsi     discovering
NOTICE: /dev/sdg: nvidia  discovering
NOTICE: /dev/sdg: pdc     discovering
NOTICE: /dev/sdg: sil     discovering
NOTICE: /dev/sdg: via     discovering
NOTICE: /dev/sdf: asr     discovering
NOTICE: /dev/sdf: hpt37x  discovering
NOTICE: /dev/sdf: hpt45x  discovering
NOTICE: /dev/sdf: jmicron discovering
NOTICE: /dev/sdf: lsi     discovering
NOTICE: /dev/sdf: nvidia  discovering
NOTICE: /dev/sdf: pdc     discovering
NOTICE: /dev/sdf: sil     discovering
NOTICE: /dev/sdf: via     discovering
NOTICE: /dev/sde: asr     discovering
NOTICE: /dev/sde: hpt37x  discovering
NOTICE: /dev/sde: hpt45x  discovering
NOTICE: /dev/sde: jmicron discovering
NOTICE: /dev/sde: lsi     discovering
NOTICE: /dev/sde: nvidia  discovering
NOTICE: /dev/sde: pdc     discovering
NOTICE: /dev/sde: sil     discovering
NOTICE: /dev/sde: via     discovering
NOTICE: /dev/sdd: asr     discovering
NOTICE: /dev/sdd: hpt37x  discovering
NOTICE: /dev/sdd: hpt45x  discovering
NOTICE: /dev/sdd: jmicron discovering
NOTICE: /dev/sdd: lsi     discovering
NOTICE: /dev/sdd: nvidia  discovering
NOTICE: /dev/sdd: pdc     discovering
NOTICE: /dev/sdd: sil     discovering
NOTICE: /dev/sdd: via     discovering
NOTICE: /dev/sdc: asr     discovering
NOTICE: /dev/sdc: hpt37x  discovering
NOTICE: /dev/sdc: hpt45x  discovering
NOTICE: /dev/sdc: jmicron discovering
NOTICE: /dev/sdc: lsi     discovering
NOTICE: /dev/sdc: nvidia  discovering
NOTICE: /dev/sdc: pdc     discovering
NOTICE: /dev/sdc: sil     discovering
NOTICE: /dev/sdc: via     discovering
NOTICE: /dev/sdb: asr     discovering
NOTICE: /dev/sdb: hpt37x  discovering
NOTICE: /dev/sdb: hpt45x  discovering
NOTICE: /dev/sdb: jmicron discovering
NOTICE: /dev/sdb: lsi     discovering
NOTICE: /dev/sdb: nvidia  discovering
NOTICE: /dev/sdb: pdc     discovering
NOTICE: /dev/sdb: sil     discovering
NOTICE: /dev/sdb: via     discovering
no raid disks with format:
"asr,hpt37x,hpt45x,jmicron,lsi,nvidia,pdc,sil,via,dos"
WARN: unlocking /var/lock/dmraid/.lock

Output of dmraid --raid_devices command:
/dev/sdg: ddf1, ".ddf1_disks", GROUP, unknown, 285155328 sectors, data@ 0
/dev/sdf: ddf1, ".ddf1_disks", GROUP, ok, 285155328 sectors, data@ 0
/dev/sde: ddf1, ".ddf1_disks", GROUP, ok, 285155328 sectors, data@ 0
/dev/sdd: ddf1, ".ddf1_disks", GROUP, ok, 285155328 sectors, data@ 0
/dev/sdc: ddf1, ".ddf1_disks", GROUP, ok, 285155328 sectors, data@ 0

Output of dmraid -s -s command:
*** Group superset .ddf1_disks
--> Subset
name   : ddf1_MegaSR   R5 #0
size   : 855465984
stride : 128
type   : raid5_ls
status : ok
subsets: 0
devs   : 4
spares : 0

Output of blkid command:
/dev/sdd: UUID="LSI     M-^@M-^F^]h^Q7" TYPE="ddf_raid_member"
/dev/sde: UUID="LSI     M-^@M-^F^]h^Q7" TYPE="ddf_raid_member"
/dev/sr0: LABEL="iflnet" TYPE="iso9660"
/dev/sdg: UUID="LSI     M-^@M-^F^]h^Q7" TYPE="ddf_raid_member"
/dev/sdb: LABEL="CS SUPPORT" UUID="4CC3-3665" TYPE="vfat"

Output of free command (memory):

             total       used       free     shared    buffers     cached
Mem:       2039280     185776    1853504          0      40996      76512
-/+ buffers/cache:      68268    1971012
Swap:            0          0          0

Output of lsmod command (loaded modules):

Module                  Size  Used by    Not tainted
sr_mod                 10387  0
cdrom                  24577  1 sr_mod
sg                     17182  0
sd_mod                 25922  2
crc_t10dif               996  1 sd_mod
usb_storage            32002  1
hid_generic              665  0
usbhid                 18838  0
hid                    60158  2 hid_generic,usbhid
pcspkr                  1227  0
igb                    98565  0
i2c_algo_bit            3747  1 igb
ptp                     5420  1 igb
pps_core                4544  1 ptp
i2c_i801                7257  0
i2c_core               13114  3 igb,i2c_algo_bit,i2c_i801
ehci_pci                2540  0
isci                   71218  1
libsas                 47529  1 isci
processor              14951  8
thermal_sys            12739  1 processor
scsi_transport_sas     16455  2 isci,libsas
hwmon                    881  2 igb,thermal_sys
button                  3413  0
dm_mod                 50912  0
ehci_hcd               28628  1 ehci_pci
edd                     5144  0

Contents of /sys/module:
8250                i2c_algo_bit        ptp                 speakup_dtlk
acpi                i2c_core            raid1               speakup_dummy
auth_rpcgss         i2c_i801            raid10              speakup_keypc
block               i8042               random              speakup_ltlk
brd                 igb                 rcupdate            speakup_soft
button              input_polldev       rcutree             speakup_spkout
cdrom               isci                rfkill              speakup_txprt
cifs                kernel              scsi_mod            spurious
cpuidle             keyboard            scsi_transport_fc   sr_mod
crc_t10dif          libata              scsi_transport_sas  sunrpc
dm_mod              libsas              sd_mod              tcp_cubic
dns_resolver        lockd               sg                  thermal_sys
edd                 md_mod              speakup             usb_storage
ehci_hcd            mousedev            speakup_acntpc      usbcore
ehci_pci            nfs                 speakup_acntsa      usbhid
firmware_class      pcie_aspm           speakup_apollo      vt
fuse                pcspkr              speakup_audptr      workqueue
hid                 pps_core            speakup_bns         xz_dec
hid_generic         printk              speakup_decext
hwmon               processor           speakup_dectlk

On Tue, Nov 19, 2013 at 5:29 PM, David F. <df7729@gmail.com> wrote:
> Hi,
>
> Yes, it's in this thread already - but I'll repost here:
>
> For cisco's server:
>
>> mdadm --examine --scan
> ARRAY metadata=ddf UUID=7ab254d0:fae71048:
> 404edde9:750a8a05
> ARRAY container=7ab254d0:fae71048:404edde9:750a8a05 member=0
> UUID=5337ab03:86ca2abc:d42bfbc8:23626c78
>
>> mdadm --assemble --scan --no-degraded -v
> mdadm: looking for devices for further assembly
> mdadm: /dev/md/ddf0 is a container, but we are looking for components
> mdadm: no RAID superblock on /dev/sdf
> mdadm: no RAID superblock on /dev/md/MegaSR2
> mdadm: no RAID superblock on /dev/md/MegaSR1
> mdadm: no RAID superblock on /dev/md/MegaSR
> mdadm: cannot open device /dev/sr0: No medium found
> mdadm: /dev/sdd is busy - skipping
> mdadm: /dev/sdc is busy - skipping
> mdadm: /dev/sdb is busy - skipping
> mdadm: /dev/sda is busy - skipping
> mdadm: /dev/sde is busy - skipping
> mdadm: looking for devices for further assembly
> mdadm: looking in container /dev/md/ddf0
> mdadm: member /md127/0 in /dev/md/ddf0 is already assembled
> mdadm: Cannot assemble mbr metadata on /dev/sdf
> mdadm: Cannot assemble mbr metadata on /dev/md/MegaSR2
> mdadm: Cannot assemble mbr metadata on /dev/md/MegaSR1
> mdadm: Cannot assemble mbr metadata on /dev/md/MegaSR
> mdadm: cannot open device /dev/sr0: No medium found
> mdadm: /dev/sdd has wrong uuid.
> mdadm: /dev/sdc has wrong uuid.
> mdadm: /dev/sdb has wrong uuid.
> mdadm: /dev/sda has wrong uuid.
> mdadm: /dev/sde has wrong uuid.
>
>
>
>
> On Tue, Nov 19, 2013 at 4:48 PM, NeilBrown <neilb@suse.de> wrote:
>> On Tue, 19 Nov 2013 16:35:44 -0800 "David F." <df7729@gmail.com> wrote:
>>>
>>> Also, was Cisco's information useful for why their cisco server (using
>>> lsi)  RAID5 won't assembly?  Do you think the patch deals with that as
>>> well?
>>>
>>>
>>
>> You did say something about Cisco in the original post but it wasn't at all
>> clear what you were asking or what the context was.
>>
>> Could you please spell it all out.  You said something about dmraid reporting
>> something.  Showing the output of dmraid in that case wouldn't hurt.
>> And show the output of "mdadm --examine /dev/device" for any relevant device
>> is often a good idea.
>>
>> NeilBrown

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-20  1:34                   ` David F.
@ 2013-11-20  2:30                     ` NeilBrown
  2013-11-20  6:41                       ` David F.
  2013-11-21 20:46                       ` Martin Wilck
  0 siblings, 2 replies; 44+ messages in thread
From: NeilBrown @ 2013-11-20  2:30 UTC (permalink / raw)
  To: David F.; +Cc: Martin Wilck, linux-raid

[-- Attachment #1: Type: text/plain, Size: 3094 bytes --]

On Tue, 19 Nov 2013 17:34:29 -0800 "David F." <df7729@gmail.com> wrote:


> Contents of /proc/partitions:
> major minor  #blocks  name
> 
>    8       32  143638992 sdc
>    8       33     102400 sdc1
>    8       34  143535568 sdc2
>    8       48  143638992 sdd
>    8       64  143638992 sde
>    8       80  143638992 sdf
>    8       81     102400 sdf1
>    8       82  143535568 sdf2
>    8       96  143638992 sdg
>   11        0      48160 sr0
>    8       16    7632892 sdb

This seems to suggest that there are no md devices that are active.


> Contents of /proc/mdstat (Linux software RAID status):
> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
> [raid4] [multipath]
> md127 : inactive sdg[0](S)
>       1061328 blocks super external:ddf
> 
> unused devices: <none>

And this confirms it - just md127 which is inactive and is a ddf 'container'.

> Contents of /etc/mdadm/mdadm.conf (Linux software RAID config file):
> # mdadm.conf
> #
> # Please refer to mdadm.conf(5) for information about this file.
> #
> 
> # by default (built-in), scan all partitions (/proc/partitions) and all
> # containers for MD superblocks. alternatively, specify devices to scan, using
> # wildcards if desired.
> DEVICE partitions containers
> 
> # automatically tag new arrays as belonging to the local system
> HOMEHOST <system>
> 
> ARRAY metadata=ddf UUID=7ab254d0:fae71048:404edde9:750a8a05
> ARRAY container=7ab254d0:fae71048:404edde9:750a8a05 member=0
> UUID=45b3ab73:5c998afc:01bbf815:12660984

This shows that mdadm is expecting a container with
      UUID=7ab254d0:fae71048:404edde9:750a8a05
which is presumably found, and a member with
      UUID=45b3ab73:5c998afc:01bbf815:12660984
which it presumably has not found.

> >
> >> mdadm --examine --scan
> > ARRAY metadata=ddf UUID=7ab254d0:fae71048:
> > 404edde9:750a8a05
> > ARRAY container=7ab254d0:fae71048:404edde9:750a8a05 member=0
> > UUID=5337ab03:86ca2abc:d42bfbc8:23626c78

This shows that mdadm found a container with the correct UUID, but the member
array inside the container has the wrong uuid.

Martin: I think one of your recent changes would have changed the member UUID
for some specific arrays because the one that was being created before wasn't
reliably stable.  Could  that apply to David's situation?

David: if you remove the "UUID=" part for the array leaving the
"container=.... member=0" as the identification, does it work?

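As a minimal sketch (container UUID taken from the mdadm.conf quoted
above), the two identification lines would then reduce to:

  ARRAY metadata=ddf UUID=7ab254d0:fae71048:404edde9:750a8a05
  ARRAY container=7ab254d0:fae71048:404edde9:750a8a05 member=0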

> >
> >> mdadm --assemble --scan --no-degraded -v
> > mdadm: looking for devices for further assembly
> > mdadm: /dev/md/ddf0 is a container, but we are looking for components
> > mdadm: no RAID superblock on /dev/sdf
> > mdadm: no RAID superblock on /dev/md/MegaSR2
> > mdadm: no RAID superblock on /dev/md/MegaSR1
> > mdadm: no RAID superblock on /dev/md/MegaSR

This seems to suggest that there were 3 md arrays active, whereas the
previous data didn't show that.  So it seems the two sets of information are
inconsistent and any conclusions I draw are uncertain.

NeilBrown


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-20  2:30                     ` NeilBrown
@ 2013-11-20  6:41                       ` David F.
  2013-11-20 23:15                         ` David F.
  2013-11-21 20:46                       ` Martin Wilck
  1 sibling, 1 reply; 44+ messages in thread
From: David F. @ 2013-11-20  6:41 UTC (permalink / raw)
  To: NeilBrown; +Cc: Martin Wilck, linux-raid

on the raid0 isw - the patch seems to work.

On Tue, Nov 19, 2013 at 6:30 PM, NeilBrown <neilb@suse.de> wrote:
> On Tue, 19 Nov 2013 17:34:29 -0800 "David F." <df7729@gmail.com> wrote:
>
>
>> Contents of /proc/partitions:
>> major minor  #blocks  name
>>
>>    8       32  143638992 sdc
>>    8       33     102400 sdc1
>>    8       34  143535568 sdc2
>>    8       48  143638992 sdd
>>    8       64  143638992 sde
>>    8       80  143638992 sdf
>>    8       81     102400 sdf1
>>    8       82  143535568 sdf2
>>    8       96  143638992 sdg
>>   11        0      48160 sr0
>>    8       16    7632892 sdb
>
> This seems to suggest that there are no md devices that are active.
>
>
>> Contents of /proc/mdstat (Linux software RAID status):
>> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
>> [raid4] [multipath]
>> md127 : inactive sdg[0](S)
>>       1061328 blocks super external:ddf
>>
>> unused devices: <none>
>
> And this confirms it - just md127 which is inactive and is a ddf 'container'.
>
>> Contents of /etc/mdadm/mdadm.conf (Linux software RAID config file):
>> # mdadm.conf
>> #
>> # Please refer to mdadm.conf(5) for information about this file.
>> #
>>
>> # by default (built-in), scan all partitions (/proc/partitions) and all
>> # containers for MD superblocks. alternatively, specify devices to scan, using
>> # wildcards if desired.
>> DEVICE partitions containers
>>
>> # automatically tag new arrays as belonging to the local system
>> HOMEHOST <system>
>>
>> ARRAY metadata=ddf UUID=7ab254d0:fae71048:404edde9:750a8a05
>> ARRAY container=7ab254d0:fae71048:404edde9:750a8a05 member=0
>> UUID=45b3ab73:5c998afc:01bbf815:12660984
>
> This shows that mdadm is expecting a container with
>       UUID=7ab254d0:fae71048:404edde9:750a8a05
> which is presumably found, and a member with
>       UUID=45b3ab73:5c998afc:01bbf815:12660984
> which it presumably has not found.
>
>> >
>> >> mdadm --examine --scan
>> > ARRAY metadata=ddf UUID=7ab254d0:fae71048:
>> > 404edde9:750a8a05
>> > ARRAY container=7ab254d0:fae71048:404edde9:750a8a05 member=0
>> > UUID=5337ab03:86ca2abc:d42bfbc8:23626c78
>
> This shows that mdadm found a container with the correct UUID, but the member
> array inside the container has the wrong uuid.
>
> Martin: I think one of your recent changes would have changed the member UUID
> for some specific arrays because the one that was being created before wasn't
> reliably stable.  Could  that apply to David's situation?
>
> David: if you remove the "UUID=" part for the array leaving the
> "container=.... member=0" as the identification, does it work?
>
>
>> >
>> >> mdadm --assemble --scan --no-degraded -v
>> > mdadm: looking for devices for further assembly
>> > mdadm: /dev/md/ddf0 is a container, but we are looking for components
>> > mdadm: no RAID superblock on /dev/sdf
>> > mdadm: no RAID superblock on /dev/md/MegaSR2
>> > mdadm: no RAID superblock on /dev/md/MegaSR1
>> > mdadm: no RAID superblock on /dev/md/MegaSR
>
> This seems to suggest that there were 3 md arrays active, whereas the
> previous data didn't show that.  So it seems the two sets of information are
> inconsistent and any conclusions I draw are uncertain.
>
> NeilBrown
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-20  6:41                       ` David F.
@ 2013-11-20 23:15                         ` David F.
  2013-11-21 20:50                           ` Martin Wilck
  0 siblings, 1 reply; 44+ messages in thread
From: David F. @ 2013-11-20 23:15 UTC (permalink / raw)
  To: NeilBrown; +Cc: Martin Wilck, linux-raid

I have the updated info for that Cisco server with the LSI controller.
This was all run at the same time, using the latest patch that worked
on the ISW system.

Storage devices (non-RAID) detected by Linux based on /sys/block:

Optical drives (srx):
  sr0: hp       DVDRAM GT30L     (fw rev = mP06)
Removable drives (sdx):
  sdg: 7453 MiB          Patriot Memory
Fixed hard drives (sdx):
  sda: 1863 GiB WD       Ext HDD 1021
  sdb: 136 GiB TOSHIBA  MK1401GRRB
  sdc: 136 GiB TOSHIBA  MK1401GRRB
  sdd: 136 GiB TOSHIBA  MK1401GRRB
  sde: 136 GiB TOSHIBA  MK1401GRRB
  sdf: 136 GiB TOSHIBA  MK1401GRRB

Contents of /sys/block:
md127  ram0   ram1   sda    sdb    sdc    sdd    sde    sdf    sdg    sr0

Contents of /dev:
block               port                tty1                tty47
bsg                 ppp                 tty10               tty48
bus                 psaux               tty11               tty49
cdrom               ptmx                tty12               tty5
cdrom1              ptp0                tty13               tty50
cdrw                ptp1                tty14               tty51
cdrw1               pts                 tty15               tty52
char                ram0                tty16               tty53
console             ram1                tty17               tty54
core                random              tty18               tty55
cpu_dma_latency     rfkill              tty19               tty56
disk                rtc                 tty2                tty57
dvd                 rtc0                tty20               tty58
dvd1                sda                 tty21               tty59
dvdrw               sda1                tty22               tty6
dvdrw1              sdb                 tty23               tty60
fd                  sdb1                tty24               tty61
full                sdb2                tty25               tty62
fuse                sdc                 tty26               tty63
hidraw0             sdd                 tty27               tty7
hidraw1             sde                 tty28               tty8
hidraw2             sde1                tty29               tty9
hidraw3             sde2                tty3                ttyS0
hidraw4             sdf                 tty30               ttyS1
hidraw5             sdg                 tty31               ttyS2
hpet                sg0                 tty32               ttyS3
input               sg1                 tty33               urandom
kmsg                sg2                 tty34               vcs
log                 sg3                 tty35               vcs1
loop-control        sg4                 tty36               vcs2
loop0               sg5                 tty37               vcs3
loop1               sg6                 tty38               vcs4
mapper              sg7                 tty39               vcs5
mcelog              shm                 tty4                vcsa
md                  sr0                 tty40               vcsa1
md127               stderr              tty41               vcsa2
mem                 stdin               tty42               vcsa3
net                 stdout              tty43               vcsa4
network_latency     synth               tty44               vcsa5
network_throughput  tty                 tty45               vga_arbiter
null                tty0                tty46               zero

Contents of /proc/partitions:
major minor  #blocks  name

   8        0 1953512448 sda
   8        1 1953511424 sda1
   8       16  143638992 sdb
   8       17     102400 sdb1
   8       18  143535568 sdb2
   8       32  143638992 sdc
   8       48  143638992 sdd
   8       64  143638992 sde
   8       65     102400 sde1
   8       66  143535568 sde2
   8       80  143638992 sdf
  11        0      48608 sr0
   8       96    7632892 sdg

Disk /dev/sda: 2000.3 GB, 2000396746752 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907024896 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start         End      Blocks  Id System
/dev/sda1            2048  3907024895  1953511424   7 HPFS/NTFS

Disk /dev/sdb: 147.0 GB, 147086327808 bytes
255 heads, 63 sectors/track, 17882 cylinders, total 287277984 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start         End      Blocks  Id System
/dev/sdb1   *        2048      206847      102400   7 HPFS/NTFS
Partition 1 does not end on cylinder boundary
/dev/sdb2          206848   855463935   427628544   7 HPFS/NTFS

Disk /dev/sdc: 147.0 GB, 147086327808 bytes
255 heads, 63 sectors/track, 17882 cylinders, total 287277984 sectors
Units = sectors of 1 * 512 = 512 bytes

Disk /dev/sdc doesn't contain a valid partition table

Disk /dev/sdd: 147.0 GB, 147086327808 bytes
255 heads, 63 sectors/track, 17882 cylinders, total 287277984 sectors
Units = sectors of 1 * 512 = 512 bytes

Disk /dev/sdd doesn't contain a valid partition table

Disk /dev/sde: 147.0 GB, 147086327808 bytes
255 heads, 63 sectors/track, 17882 cylinders, total 287277984 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start         End      Blocks  Id System
/dev/sde1   *        2048      206847      102400   7 HPFS/NTFS
Partition 1 does not end on cylinder boundary
/dev/sde2          206848   855463935   427628544   7 HPFS/NTFS

Disk /dev/sdf: 147.0 GB, 147086327808 bytes
255 heads, 63 sectors/track, 17882 cylinders, total 287277984 sectors
Units = sectors of 1 * 512 = 512 bytes

Disk /dev/sdf doesn't contain a valid partition table

Disk /dev/sdg: 7816 MB, 7816081408 bytes
241 heads, 62 sectors/track, 1021 cylinders, total 15265784 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start         End      Blocks  Id System
/dev/sdg1   ?   778135908  1919645538   570754815+ 72 Unknown
Partition 1 has different physical/logical beginnings (non-Linux?):
     phys=(357, 116, 40) logical=(52077, 22, 11)
Partition 1 has different physical/logical endings:
     phys=(357, 32, 45) logical=(128473, 31, 51)
Partition 1 does not end on cylinder boundary
/dev/sdg2   ?   168689522  2104717761   968014120  65 Unknown
Partition 2 has different physical/logical beginnings (non-Linux?):
     phys=(288, 115, 43) logical=(11289, 149, 47)
Partition 2 has different physical/logical endings:
     phys=(367, 114, 50) logical=(140859, 41, 42)
Partition 2 does not end on cylinder boundary
/dev/sdg3   ?  1869881465  3805909656   968014096  79 Unknown
Partition 3 has different physical/logical beginnings (non-Linux?):
     phys=(366, 32, 33) logical=(125142, 156, 30)
Partition 3 has different physical/logical endings:
     phys=(357, 32, 43) logical=(254712, 47, 39)
Partition 3 does not end on cylinder boundary
/dev/sdg4   ?  2885681152  2885736650       27749+  d Unknown
Partition 4 has different physical/logical beginnings (non-Linux?):
     phys=(372, 97, 50) logical=(193125, 119, 25)
Partition 4 has different physical/logical endings:
     phys=(0, 10, 0) logical=(193129, 50, 33)
Partition 4 does not end on cylinder boundary

Partition table entries are not in disk order

Contents of /proc/mdstat (Linux software RAID status):
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
[raid4] [multipath]
md127 : inactive sdf[0](S)
      1061328 blocks super external:ddf

unused devices: <none>

Contents of /run/mdadm/map (Linux software RAID arrays):
md127 ddf 7ab254d0:fae71048:404edde9:750a8a05 /dev/md/ddf0

Contents of /etc/mdadm/mdadm.conf (Linux software RAID config file):
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
DEVICE partitions containers

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

ARRAY metadata=ddf UUID=7ab254d0:fae71048:404edde9:750a8a05
ARRAY /dev/md/MegaSR container=7ab254d0:fae71048:404edde9:750a8a05
member=0 UUID=9515a157:e69d2ad3:8427131f:92b29c7a

Long listing of /dev/md directory and /dev/md* device files (Linux
software RAID devices):
brw-rw---T    1 root     disk        9, 127 Nov 20 15:06 /dev/md127

/dev/md:
lrwxrwxrwx    1 root     root             8 Nov 20 15:06 ddf0 -> ../md127

Contents of /tbu/utility/mdadm.txt (mdadm troubleshooting data
captured when 'start-md' is executed):
mdadm - v3.3-32-g357ac10 - 20th November 2013
Output of 'mdadm --examine --scan'
ARRAY metadata=ddf UUID=7ab254d0:fae71048:404edde9:750a8a05
ARRAY /dev/md/MegaSR container=7ab254d0:fae71048:404edde9:750a8a05
member=0 UUID=9515a157:e69d2ad3:8427131f:92b29c7a
Output of 'mdadm --assemble --scan --no-degraded -v'
mdadm: looking for devices for further assembly
mdadm: no RAID superblock on /dev/sr0
mdadm: no RAID superblock on /dev/sde2
mdadm: no RAID superblock on /dev/sde1
mdadm: no RAID superblock on /dev/sdb2
mdadm: no RAID superblock on /dev/sdb1
mdadm: no RAID superblock on /dev/sda1
mdadm: no RAID superblock on /dev/sda
mdadm: /dev/sdf is identified as a member of /dev/md/ddf0, slot 4.
mdadm: /dev/sde is identified as a member of /dev/md/ddf0, slot 3.
mdadm: /dev/sdd is identified as a member of /dev/md/ddf0, slot 2.
mdadm: /dev/sdc is identified as a member of /dev/md/ddf0, slot 1.
mdadm: /dev/sdb is identified as a member of /dev/md/ddf0, slot 0.
mdadm: ignoring /dev/sdb as it reports /dev/sdf as failed
mdadm: ignoring /dev/sdc as it reports /dev/sdf as failed
mdadm: ignoring /dev/sdd as it reports /dev/sdf as failed
mdadm: ignoring /dev/sde as it reports /dev/sdf as failed
mdadm: no uptodate device for slot 0 of /dev/md/ddf0
mdadm: no uptodate device for slot 2 of /dev/md/ddf0
mdadm: no uptodate device for slot 4 of /dev/md/ddf0
mdadm: no uptodate device for slot 6 of /dev/md/ddf0
mdadm: added /dev/sdf to /dev/md/ddf0 as 4
mdadm: Container /dev/md/ddf0 has been assembled with 1 drive (out of 5)
mdadm: looking for devices for /dev/md/MegaSR
mdadm: looking in container /dev/md/ddf0
mdadm: no recogniseable superblock on /dev/sr0
mdadm: /dev/sdf has wrong uuid.
mdadm: no recogniseable superblock on /dev/sde2
mdadm: no recogniseable superblock on /dev/sde1
mdadm: /dev/sde has wrong uuid.
mdadm: /dev/sdd has wrong uuid.
mdadm: /dev/sdc has wrong uuid.
mdadm: no recogniseable superblock on /dev/sdb2
mdadm: no recogniseable superblock on /dev/sdb1
mdadm: /dev/sdb has wrong uuid.
mdadm: Cannot assemble mbr metadata on /dev/sda1
mdadm: Cannot assemble mbr metadata on /dev/sda
mdadm: looking for devices for /dev/md/MegaSR
mdadm: looking in container /dev/md/ddf0
mdadm: no recogniseable superblock on /dev/sr0
mdadm: /dev/sdf has wrong uuid.
mdadm: no recogniseable superblock on /dev/sde2
mdadm: no recogniseable superblock on /dev/sde1
mdadm: /dev/sde has wrong uuid.
mdadm: /dev/sdd has wrong uuid.
mdadm: /dev/sdc has wrong uuid.
mdadm: no recogniseable superblock on /dev/sdb2
mdadm: no recogniseable superblock on /dev/sdb1
mdadm: /dev/sdb has wrong uuid.
mdadm: Cannot assemble mbr metadata on /dev/sda1
mdadm: Cannot assemble mbr metadata on /dev/sda
Output of 'dmesg | grep md:'
md: linear personality registered for level -1
md: raid0 personality registered for level 0
md: raid1 personality registered for level 1
md: raid10 personality registered for level 10
md: raid6 personality registered for level 6
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
md: multipath personality registered for level -4
md: md127 stopped.
md: bind<sdf>

Contents of /dev/mapper directory:
crw------T    1 root     root       10, 236 Nov 20 15:05 control

startraid mode = null (default)

Contents of /tbu/utility/dmraid-boot.txt file (from activate command on boot):
NOTICE: checking format identifier asr
NOTICE: checking format identifier hpt37x
NOTICE: checking format identifier hpt45x
NOTICE: checking format identifier jmicron
NOTICE: checking format identifier lsi
NOTICE: checking format identifier nvidia
NOTICE: checking format identifier pdc
NOTICE: checking format identifier sil
NOTICE: checking format identifier via
NOTICE: checking format identifier dos
WARN: locking /var/lock/dmraid/.lock
NOTICE: /dev/sdf: asr     discovering
NOTICE: /dev/sdf: hpt37x  discovering
NOTICE: /dev/sdf: hpt45x  discovering
NOTICE: /dev/sdf: jmicron discovering
NOTICE: /dev/sdf: lsi     discovering
NOTICE: /dev/sdf: nvidia  discovering
NOTICE: /dev/sdf: pdc     discovering
NOTICE: /dev/sdf: sil     discovering
NOTICE: /dev/sdf: via     discovering
NOTICE: /dev/sde: asr     discovering
NOTICE: /dev/sde: hpt37x  discovering
NOTICE: /dev/sde: hpt45x  discovering
NOTICE: /dev/sde: jmicron discovering
NOTICE: /dev/sde: lsi     discovering
NOTICE: /dev/sde: nvidia  discovering
NOTICE: /dev/sde: pdc     discovering
NOTICE: /dev/sde: sil     discovering
NOTICE: /dev/sde: via     discovering
NOTICE: /dev/sdd: asr     discovering
NOTICE: /dev/sdd: hpt37x  discovering
NOTICE: /dev/sdd: hpt45x  discovering
NOTICE: /dev/sdd: jmicron discovering
NOTICE: /dev/sdd: lsi     discovering
NOTICE: /dev/sdd: nvidia  discovering
NOTICE: /dev/sdd: pdc     discovering
NOTICE: /dev/sdd: sil     discovering
NOTICE: /dev/sdd: via     discovering
NOTICE: /dev/sdc: asr     discovering
NOTICE: /dev/sdc: hpt37x  discovering
NOTICE: /dev/sdc: hpt45x  discovering
NOTICE: /dev/sdc: jmicron discovering
NOTICE: /dev/sdc: lsi     discovering
NOTICE: /dev/sdc: nvidia  discovering
NOTICE: /dev/sdc: pdc     discovering
NOTICE: /dev/sdc: sil     discovering
NOTICE: /dev/sdc: via     discovering
NOTICE: /dev/sdb: asr     discovering
NOTICE: /dev/sdb: hpt37x  discovering
NOTICE: /dev/sdb: hpt45x  discovering
NOTICE: /dev/sdb: jmicron discovering
NOTICE: /dev/sdb: lsi     discovering
NOTICE: /dev/sdb: nvidia  discovering
NOTICE: /dev/sdb: pdc     discovering
NOTICE: /dev/sdb: sil     discovering
NOTICE: /dev/sdb: via     discovering
NOTICE: /dev/sda: asr     discovering
NOTICE: /dev/sda: hpt37x  discovering
NOTICE: /dev/sda: hpt45x  discovering
NOTICE: /dev/sda: jmicron discovering
NOTICE: /dev/sda: lsi     discovering
NOTICE: /dev/sda: nvidia  discovering
NOTICE: /dev/sda: pdc     discovering
NOTICE: /dev/sda: sil     discovering
NOTICE: /dev/sda: via     discovering
no raid disks with format:
"asr,hpt37x,hpt45x,jmicron,lsi,nvidia,pdc,sil,via,dos"
WARN: unlocking /var/lock/dmraid/.lock

Output of dmraid --raid_devices command:
/dev/sdf: ddf1, ".ddf1_disks", GROUP, unknown, 285155328 sectors, data@ 0
/dev/sde: ddf1, ".ddf1_disks", GROUP, ok, 285155328 sectors, data@ 0
/dev/sdd: ddf1, ".ddf1_disks", GROUP, ok, 285155328 sectors, data@ 0
/dev/sdc: ddf1, ".ddf1_disks", GROUP, ok, 285155328 sectors, data@ 0
/dev/sdb: ddf1, ".ddf1_disks", GROUP, ok, 285155328 sectors, data@ 0

Output of dmraid -s -s command:
*** Group superset .ddf1_disks
--> Subset
name   : ddf1_MegaSR   R5 #0
size   : 855465984
stride : 128
type   : raid5_ls
status : ok
subsets: 0
devs   : 4
spares : 0

Output of blkid command:
/dev/sda1: LABEL="Elements" UUID="DA1CE6D71CE6AE27" TYPE="ntfs"
/dev/sdc: UUID="LSI     M-^@M-^F^]h^Q7" TYPE="ddf_raid_member"
/dev/sdd: UUID="LSI     M-^@M-^F^]h^Q7" TYPE="ddf_raid_member"
/dev/sdf: UUID="LSI     M-^@M-^F^]h^Q7" TYPE="ddf_raid_member"
/dev/sdg: LABEL="CS SUPPORT" UUID="4CC3-3665" TYPE="vfat"
/dev/sr0: LABEL="iflnet" TYPE="iso9660"

Output of free command (memory):

             total       used       free     shared    buffers     cached
Mem:       2039280     185052    1854228          0      40996      77220
-/+ buffers/cache:      66836    1972444
Swap:            0          0          0

Output of lsmod command (loaded modules):

Module                  Size  Used by    Not tainted
sr_mod                 10387  0
cdrom                  24577  1 sr_mod
sg                     17182  0
sd_mod                 25922  1
crc_t10dif               996  1 sd_mod
usb_storage            32098  0
hid_generic              665  0
usbhid                 18838  0
hid                    60174  2 hid_generic,usbhid
pcspkr                  1227  0
isci                   71218  1
i2c_i801                7257  0
libsas                 47529  1 isci
scsi_transport_sas     16455  2 isci,libsas
igb                    98565  0
i2c_algo_bit            3747  1 igb
i2c_core               13114  3 i2c_i801,igb,i2c_algo_bit
ehci_pci                2540  0
ptp                     5420  1 igb
pps_core                4544  1 ptp
processor              14951  8
thermal_sys            12739  1 processor
hwmon                    881  2 igb,thermal_sys
button                  3413  0
dm_mod                 50912  0
ehci_hcd               28628  1 ehci_pci
edd                     5144  0

Contents of /sys/module:
8250                i2c_algo_bit        ptp                 speakup_dtlk
acpi                i2c_core            raid1               speakup_dummy
auth_rpcgss         i2c_i801            raid10              speakup_keypc
block               i8042               random              speakup_ltlk
brd                 igb                 rcupdate            speakup_soft
button              input_polldev       rcutree             speakup_spkout
cdrom               isci                rfkill              speakup_txprt
cifs                kernel              scsi_mod            spurious
cpuidle             keyboard            scsi_transport_fc   sr_mod
crc_t10dif          libata              scsi_transport_sas  sunrpc
dm_mod              libsas              sd_mod              tcp_cubic
dns_resolver        lockd               sg                  thermal_sys
edd                 md_mod              speakup             usb_storage
ehci_hcd            mousedev            speakup_acntpc      usbcore
ehci_pci            nfs                 speakup_acntsa      usbhid
firmware_class      pcie_aspm           speakup_apollo      vt
fuse                pcspkr              speakup_audptr      workqueue
hid                 pps_core            speakup_bns         xz_dec
hid_generic         printk              speakup_decext
hwmon               processor           speakup_dectlk

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-20  2:30                     ` NeilBrown
  2013-11-20  6:41                       ` David F.
@ 2013-11-21 20:46                       ` Martin Wilck
  2013-11-21 21:06                         ` David F.
  2013-11-21 23:05                         ` David F.
  1 sibling, 2 replies; 44+ messages in thread
From: Martin Wilck @ 2013-11-21 20:46 UTC (permalink / raw)
  To: NeilBrown; +Cc: David F., linux-raid

On 11/20/2013 03:30 AM, NeilBrown wrote:

>>>> mdadm --examine --scan
>>> ARRAY metadata=ddf UUID=7ab254d0:fae71048:
>>> 404edde9:750a8a05
>>> ARRAY container=7ab254d0:fae71048:404edde9:750a8a05 member=0
>>> UUID=5337ab03:86ca2abc:d42bfbc8:23626c78
> 
> This shows that mdadm found a container with the correct UUID, but the member
> array inside the container has the wrong uuid.
> 
> Martin: I think one of your recent changes would have changed the member UUID
> for some specific arrays because the one that was being created before wasn't
> reliably stable.  Could  that apply to David's situation?

I am confused. AFAIK, my patch bedbf68a first introduced subarray UUIDs
for DDF. I don't understand how this mdadm.conf could have worked with
mdadm 3.2.x.

But you are right, I later made 7087f02b, which changed the way
subarray UUIDs were calculated. This would hurt people who created
their mdadm.conf file with stock 3.3 and then updated to the latest git.
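
As a minimal sketch (assuming the conf file path used elsewhere in this
thread), such a stale config could be refreshed with the currently
installed mdadm and then pruned by hand:

  # regenerate ARRAY lines with the mdadm binary that will do the assembly
  mdadm --examine --scan >> /etc/mdadm/mdadm.conf
  # then edit /etc/mdadm/mdadm.conf and drop the old ARRAY entries that
  # still carry the member UUID written by the earlier mdadm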

> David: if you remove the "UUID=" part for the array leaving the
> "container=.... member=0" as the identification, does it work?

I second that. David, please try it. I'd also appreciate "mdadm -E
/dev/sdX" output for all the RAID disks.
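
For example, something along these lines would collect that in one go
(device names taken from the listings above; adjust as needed):

  for d in /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf; do
      echo "== $d =="    # label each report
      mdadm -E "$d"
  done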

Martin

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-20 23:15                         ` David F.
@ 2013-11-21 20:50                           ` Martin Wilck
  2013-11-21 21:10                             ` David F.
  0 siblings, 1 reply; 44+ messages in thread
From: Martin Wilck @ 2013-11-21 20:50 UTC (permalink / raw)
  To: David F.; +Cc: NeilBrown, linux-raid

On 11/21/2013 12:15 AM, David F. wrote:

> mdadm: /dev/sdf is identified as a member of /dev/md/ddf0, slot 4.
> mdadm: /dev/sde is identified as a member of /dev/md/ddf0, slot 3.
> mdadm: /dev/sdd is identified as a member of /dev/md/ddf0, slot 2.
> mdadm: /dev/sdc is identified as a member of /dev/md/ddf0, slot 1.
> mdadm: /dev/sdb is identified as a member of /dev/md/ddf0, slot 0.
> mdadm: ignoring /dev/sdb as it reports /dev/sdf as failed
> mdadm: ignoring /dev/sdc as it reports /dev/sdf as failed
> mdadm: ignoring /dev/sdd as it reports /dev/sdf as failed
> mdadm: ignoring /dev/sde as it reports /dev/sdf as failed
> mdadm: no uptodate device for slot 0 of /dev/md/ddf0
> mdadm: no uptodate device for slot 2 of /dev/md/ddf0
> mdadm: no uptodate device for slot 4 of /dev/md/ddf0
> mdadm: no uptodate device for slot 6 of /dev/md/ddf0
> mdadm: added /dev/sdf to /dev/md/ddf0 as 4
> mdadm: Container /dev/md/ddf0 has been assembled with 1 drive (out of 5)

That looks really weird. The (healthy?) devices sdb-sde are ignored
because they report sdf as failed, and then sdf is used for assembly?

I have no idea at the moment, I need to read the code.

> Output of dmraid --raid_devices command:
> /dev/sdf: ddf1, ".ddf1_disks", GROUP, unknown, 285155328 sectors, data@ 0
> /dev/sde: ddf1, ".ddf1_disks", GROUP, ok, 285155328 sectors, data@ 0
> /dev/sdd: ddf1, ".ddf1_disks", GROUP, ok, 285155328 sectors, data@ 0
> /dev/sdc: ddf1, ".ddf1_disks", GROUP, ok, 285155328 sectors, data@ 0
> /dev/sdb: ddf1, ".ddf1_disks", GROUP, ok, 285155328 sectors, data@ 0

This seems to support the notion that something's wrong with /dev/sdf.

Martin

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-21 20:46                       ` Martin Wilck
@ 2013-11-21 21:06                         ` David F.
  2013-11-21 23:05                         ` David F.
  1 sibling, 0 replies; 44+ messages in thread
From: David F. @ 2013-11-21 21:06 UTC (permalink / raw)
  To: Martin Wilck; +Cc: NeilBrown, linux-raid

>> Martin: I think one of your recent changes would have changed the member UUID
>> for some specific arrays because the one that was being created before wasn't
>> reliably stable.  Could  that apply to David's situation?
>
> I am confused. AFAIL, my patch bedbf68a first introduced subarray UUIDs
> for DDF. I don't understand how this mdadm.conf could have worked with
> mdadm 3.2.x.

I'm not sure the 3.2.x version works on the Cisco server with the LSI
RAID, as that is a different issue from the ISW problems most users
were having.

>
> But you are right, I had to make 7087f02b later that changed the way
> subarray UUIDs were calculated. This would hurt people who created their
> mdadm.conf file) with stock 3.3 and updated to latest git later.
>
>> David: if you remove the "UUID=" part for the array leaving the
>> "container=.... member=0" as the identification, does it work?

We sent them a version that will try that - I hope they don't get too
tired of testing.  The Int13h interface to the RAID works fine, as
does the Windows interface.

>
> I second that. David, please try it. I'd also appreciate "mdadm -E
> /dev/sdX" output for all the RAID disks.

That version we sent should output this as well.

>
> Martin

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-21 20:50                           ` Martin Wilck
@ 2013-11-21 21:10                             ` David F.
  2013-11-21 21:30                               ` Martin Wilck
  0 siblings, 1 reply; 44+ messages in thread
From: David F. @ 2013-11-21 21:10 UTC (permalink / raw)
  To: Martin Wilck; +Cc: NeilBrown, linux-raid

On that DMRAID - they have yet to try it.  But we do know the RAID5
works via the Int13h interface in real mode and via Windows.  I think
they thought it was a 4-disk array?  I'll ask if they know the actual
number of drives in the RAID configuration.

On Thu, Nov 21, 2013 at 12:50 PM, Martin Wilck <mwilck@arcor.de> wrote:
> On 11/21/2013 12:15 AM, David F. wrote:
>
>> mdadm: /dev/sdf is identified as a member of /dev/md/ddf0, slot 4.
>> mdadm: /dev/sde is identified as a member of /dev/md/ddf0, slot 3.
>> mdadm: /dev/sdd is identified as a member of /dev/md/ddf0, slot 2.
>> mdadm: /dev/sdc is identified as a member of /dev/md/ddf0, slot 1.
>> mdadm: /dev/sdb is identified as a member of /dev/md/ddf0, slot 0.
>> mdadm: ignoring /dev/sdb as it reports /dev/sdf as failed
>> mdadm: ignoring /dev/sdc as it reports /dev/sdf as failed
>> mdadm: ignoring /dev/sdd as it reports /dev/sdf as failed
>> mdadm: ignoring /dev/sde as it reports /dev/sdf as failed
>> mdadm: no uptodate device for slot 0 of /dev/md/ddf0
>> mdadm: no uptodate device for slot 2 of /dev/md/ddf0
>> mdadm: no uptodate device for slot 4 of /dev/md/ddf0
>> mdadm: no uptodate device for slot 6 of /dev/md/ddf0
>> mdadm: added /dev/sdf to /dev/md/ddf0 as 4
>> mdadm: Container /dev/md/ddf0 has been assembled with 1 drive (out of 5)
>
> That looks really weird. The (healthy?) devices sdb-sde are ignored
> because they report sdf as failed, and then sdf is used for assembly?
>
> I have no idea at the moment, I need to read the code.
>
>> Output of dmraid --raid_devices command:
>> /dev/sdf: ddf1, ".ddf1_disks", GROUP, unknown, 285155328 sectors, data@ 0
>> /dev/sde: ddf1, ".ddf1_disks", GROUP, ok, 285155328 sectors, data@ 0
>> /dev/sdd: ddf1, ".ddf1_disks", GROUP, ok, 285155328 sectors, data@ 0
>> /dev/sdc: ddf1, ".ddf1_disks", GROUP, ok, 285155328 sectors, data@ 0
>> /dev/sdb: ddf1, ".ddf1_disks", GROUP, ok, 285155328 sectors, data@ 0
>
> This seems to support then notion that something's wrong with /dev/sdf.
>
> Martin

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-21 21:10                             ` David F.
@ 2013-11-21 21:30                               ` Martin Wilck
  2013-11-21 22:39                                 ` David F.
  0 siblings, 1 reply; 44+ messages in thread
From: Martin Wilck @ 2013-11-21 21:30 UTC (permalink / raw)
  To: David F.; +Cc: NeilBrown, linux-raid

On 11/21/2013 10:10 PM, David F. wrote:
> On that DMRAID - they have yet to try it.  But we do know the RAID5
> works via the Int13h interface in real mode and via Windows.  I think
> they thought it was a 4-disk array?  I'll ask if they know the actual
> number of drives in the RAID configuration.

What distribution are these people using? I am not aware of any distro
that would activate mdadm for DDF RAID by default.

Martin

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-21 21:30                               ` Martin Wilck
@ 2013-11-21 22:39                                 ` David F.
  2013-11-25 21:39                                   ` Martin Wilck
  0 siblings, 1 reply; 44+ messages in thread
From: David F. @ 2013-11-21 22:39 UTC (permalink / raw)
  To: Martin Wilck; +Cc: NeilBrown, linux-raid

Are you saying that the old, obsolete DMRAID should still be used for
DDF RAID?  What about the 2 TiB limit?  I'd rather see modern Linux
RAID support work as well as it does for Windows.

On Thu, Nov 21, 2013 at 1:30 PM, Martin Wilck <mwilck@arcor.de> wrote:
> On 11/21/2013 10:10 PM, David F. wrote:
>> On that DMRAID - they have yet to try it.  But we do know the RAID5
>> works via the Int13h interface in real mode and via Windows.  I think
>> they thought it was a 4-disk array?  I'll ask if they know the actual
>> number of drives in the RAID configuration.
>
> What distribution are these people using? I am not aware of any distro
> that would activate mdadm for DDF RAID by default.
>
> Martin

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-21 20:46                       ` Martin Wilck
  2013-11-21 21:06                         ` David F.
@ 2013-11-21 23:05                         ` David F.
  2013-11-21 23:09                           ` David F.
  2013-11-25 21:56                           ` Martin Wilck
  1 sibling, 2 replies; 44+ messages in thread
From: David F. @ 2013-11-21 23:05 UTC (permalink / raw)
  To: Martin Wilck; +Cc: NeilBrown, linux-raid

They say it still didn't work (we got rid of the member UUID) - but
here is the more detailed info you asked for:

Storage devices (non-RAID) detected by Linux based on /sys/block:

Optical drives (srx):
  sr0: hp       DVDRAM GT30L     (fw rev = mP06)
Removable drives (sdx):
  sdg: 7453 MiB          Patriot Memory
Fixed hard drives (sdx):
  sda: 1863 GiB WD       Ext HDD 1021
  sdb: 136 GiB TOSHIBA  MK1401GRRB
  sdc: 136 GiB TOSHIBA  MK1401GRRB
  sdd: 136 GiB TOSHIBA  MK1401GRRB
  sde: 136 GiB TOSHIBA  MK1401GRRB
  sdf: 136 GiB TOSHIBA  MK1401GRRB

Contents of /sys/block:
md127  ram0   ram1   sda    sdb    sdc    sdd    sde    sdf    sdg    sr0

Contents of /dev:
block               port                tty1                tty47
bsg                 ppp                 tty10               tty48
bus                 psaux               tty11               tty49
cdrom               ptmx                tty12               tty5
cdrom1              ptp0                tty13               tty50
cdrw                ptp1                tty14               tty51
cdrw1               pts                 tty15               tty52
char                ram0                tty16               tty53
console             ram1                tty17               tty54
core                random              tty18               tty55
cpu_dma_latency     rfkill              tty19               tty56
disk                rtc                 tty2                tty57
dvd                 rtc0                tty20               tty58
dvd1                sda                 tty21               tty59
dvdrw               sda1                tty22               tty6
dvdrw1              sdb                 tty23               tty60
fd                  sdb1                tty24               tty61
full                sdb2                tty25               tty62
fuse                sdc                 tty26               tty63
hidraw0             sdd                 tty27               tty7
hidraw1             sde                 tty28               tty8
hidraw2             sde1                tty29               tty9
hidraw3             sde2                tty3                ttyS0
hidraw4             sdf                 tty30               ttyS1
hidraw5             sdg                 tty31               ttyS2
hpet                sg0                 tty32               ttyS3
input               sg1                 tty33               urandom
kmsg                sg2                 tty34               vcs
log                 sg3                 tty35               vcs1
loop-control        sg4                 tty36               vcs2
loop0               sg5                 tty37               vcs3
loop1               sg6                 tty38               vcs4
mapper              sg7                 tty39               vcs5
mcelog              shm                 tty4                vcsa
md                  sr0                 tty40               vcsa1
md127               stderr              tty41               vcsa2
mem                 stdin               tty42               vcsa3
net                 stdout              tty43               vcsa4
network_latency     synth               tty44               vcsa5
network_throughput  tty                 tty45               vga_arbiter
null                tty0                tty46               zero

Contents of /proc/partitions:
major minor  #blocks  name

   8        0 1953512448 sda
   8        1 1953511424 sda1
   8       16  143638992 sdb
   8       17     102400 sdb1
   8       18  143535568 sdb2
   8       32  143638992 sdc
   8       48  143638992 sdd
   8       64  143638992 sde
   8       65     102400 sde1
   8       66  143535568 sde2
   8       80  143638992 sdf
  11        0      48608 sr0
   8       96    7632892 sdg

Disk /dev/sda: 2000.3 GB, 2000396746752 bytes
255 heads, 63 sectors/track, 243201 cylinders, total 3907024896 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start         End      Blocks  Id System
/dev/sda1            2048  3907024895  1953511424   7 HPFS/NTFS

Disk /dev/sdb: 147.0 GB, 147086327808 bytes
255 heads, 63 sectors/track, 17882 cylinders, total 287277984 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start         End      Blocks  Id System
/dev/sdb1   *        2048      206847      102400   7 HPFS/NTFS
Partition 1 does not end on cylinder boundary
/dev/sdb2          206848   855463935   427628544   7 HPFS/NTFS

Disk /dev/sdc: 147.0 GB, 147086327808 bytes
255 heads, 63 sectors/track, 17882 cylinders, total 287277984 sectors
Units = sectors of 1 * 512 = 512 bytes

Disk /dev/sdc doesn't contain a valid partition table

Disk /dev/sdd: 147.0 GB, 147086327808 bytes
255 heads, 63 sectors/track, 17882 cylinders, total 287277984 sectors
Units = sectors of 1 * 512 = 512 bytes

Disk /dev/sdd doesn't contain a valid partition table

Disk /dev/sde: 147.0 GB, 147086327808 bytes
255 heads, 63 sectors/track, 17882 cylinders, total 287277984 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start         End      Blocks  Id System
/dev/sde1   *        2048      206847      102400   7 HPFS/NTFS
Partition 1 does not end on cylinder boundary
/dev/sde2          206848   855463935   427628544   7 HPFS/NTFS

Disk /dev/sdf: 147.0 GB, 147086327808 bytes
255 heads, 63 sectors/track, 17882 cylinders, total 287277984 sectors
Units = sectors of 1 * 512 = 512 bytes

Disk /dev/sdf doesn't contain a valid partition table

Disk /dev/sdg: 7816 MB, 7816081408 bytes
241 heads, 62 sectors/track, 1021 cylinders, total 15265784 sectors
Units = sectors of 1 * 512 = 512 bytes

   Device Boot      Start         End      Blocks  Id System
/dev/sdg1   ?   778135908  1919645538   570754815+ 72 Unknown
Partition 1 has different physical/logical beginnings (non-Linux?):
     phys=(357, 116, 40) logical=(52077, 22, 11)
Partition 1 has different physical/logical endings:
     phys=(357, 32, 45) logical=(128473, 31, 51)
Partition 1 does not end on cylinder boundary
/dev/sdg2   ?   168689522  2104717761   968014120  65 Unknown
Partition 2 has different physical/logical beginnings (non-Linux?):
     phys=(288, 115, 43) logical=(11289, 149, 47)
Partition 2 has different physical/logical endings:
     phys=(367, 114, 50) logical=(140859, 41, 42)
Partition 2 does not end on cylinder boundary
/dev/sdg3   ?  1869881465  3805909656   968014096  79 Unknown
Partition 3 has different physical/logical beginnings (non-Linux?):
     phys=(366, 32, 33) logical=(125142, 156, 30)
Partition 3 has different physical/logical endings:
     phys=(357, 32, 43) logical=(254712, 47, 39)
Partition 3 does not end on cylinder boundary
/dev/sdg4   ?  2885681152  2885736650       27749+  d Unknown
Partition 4 has different physical/logical beginnings (non-Linux?):
     phys=(372, 97, 50) logical=(193125, 119, 25)
Partition 4 has different physical/logical endings:
     phys=(0, 10, 0) logical=(193129, 50, 33)
Partition 4 does not end on cylinder boundary

Partition table entries are not in disk order

Contents of /proc/mdstat (Linux software RAID status):
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5]
[raid4] [multipath]
md127 : inactive sdf[0](S)
      1061328 blocks super external:ddf

unused devices: <none>

Contents of /run/mdadm/map (Linux software RAID arrays):
md127 ddf 7ab254d0:fae71048:404edde9:750a8a05 /dev/md/ddf0

Contents of /etc/mdadm/mdadm.conf (Linux software RAID config file):
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
DEVICE partitions containers

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

ARRAY metadata=ddf UUID=7ab254d0:fae71048:404edde9:750a8a05
ARRAY /dev/md/MegaSR container=7ab254d0:fae71048:404edde9:750a8a05 member=0

Long listing of /dev/md directory and /dev/md* device files (Linux
software RAID devices):
brw-rw---T    1 root     disk        9, 127 Nov 21 13:49 /dev/md127

/dev/md:
lrwxrwxrwx    1 root     root             8 Nov 21 13:49 ddf0 -> ../md127

Contents of /tbu/utility/mdadm.txt (mdadm troubleshooting data
captured when 'start-md' is executed):
mdadm - v3.3-32-g357ac10 - 20th November 2013
Output of 'mdadm --examine --scan'
ARRAY metadata=ddf UUID=7ab254d0:fae71048:404edde9:750a8a05
ARRAY /dev/md/MegaSR container=7ab254d0:fae71048:404edde9:750a8a05
member=0 UUID=9515a157:e69d2ad3:8427131f:92b29c7a
Output of 'mdadm --assemble --scan --no-degraded -v'
mdadm: looking for devices for further assembly
mdadm: no RAID superblock on /dev/sr0
mdadm: no RAID superblock on /dev/sde2
mdadm: no RAID superblock on /dev/sde1
mdadm: no RAID superblock on /dev/sdb2
mdadm: no RAID superblock on /dev/sdb1
mdadm: no RAID superblock on /dev/sda1
mdadm: no RAID superblock on /dev/sda
mdadm: /dev/sdf is identified as a member of /dev/md/ddf0, slot 4.
mdadm: /dev/sde is identified as a member of /dev/md/ddf0, slot 3.
mdadm: /dev/sdd is identified as a member of /dev/md/ddf0, slot 2.
mdadm: /dev/sdc is identified as a member of /dev/md/ddf0, slot 1.
mdadm: /dev/sdb is identified as a member of /dev/md/ddf0, slot 0.
mdadm: ignoring /dev/sdb as it reports /dev/sdf as failed
mdadm: ignoring /dev/sdc as it reports /dev/sdf as failed
mdadm: ignoring /dev/sdd as it reports /dev/sdf as failed
mdadm: ignoring /dev/sde as it reports /dev/sdf as failed
mdadm: no uptodate device for slot 0 of /dev/md/ddf0
mdadm: no uptodate device for slot 2 of /dev/md/ddf0
mdadm: no uptodate device for slot 4 of /dev/md/ddf0
mdadm: no uptodate device for slot 6 of /dev/md/ddf0
mdadm: added /dev/sdf to /dev/md/ddf0 as 4
mdadm: Container /dev/md/ddf0 has been assembled with 1 drive (out of 5)
mdadm: looking for devices for /dev/md/MegaSR
mdadm: looking in container /dev/md/ddf0
mdadm: no recogniseable superblock on /dev/sr0
mdadm: /dev/sdf is not a container and one is required.
mdadm: no recogniseable superblock on /dev/sde2
mdadm: no recogniseable superblock on /dev/sde1
mdadm: /dev/sde is not a container and one is required.
mdadm: /dev/sdd is not a container and one is required.
mdadm: /dev/sdc is not a container and one is required.
mdadm: no recogniseable superblock on /dev/sdb2
mdadm: no recogniseable superblock on /dev/sdb1
mdadm: /dev/sdb is not a container and one is required.
mdadm: Cannot assemble mbr metadata on /dev/sda1
mdadm: Cannot assemble mbr metadata on /dev/sda
mdadm: looking for devices for /dev/md/MegaSR
mdadm: looking in container /dev/md/ddf0
mdadm: no recogniseable superblock on /dev/sr0
mdadm: /dev/sdf is not a container and one is required.
mdadm: no recogniseable superblock on /dev/sde2
mdadm: no recogniseable superblock on /dev/sde1
mdadm: /dev/sde is not a container and one is required.
mdadm: /dev/sdd is not a container and one is required.
mdadm: /dev/sdc is not a container and one is required.
mdadm: no recogniseable superblock on /dev/sdb2
mdadm: no recogniseable superblock on /dev/sdb1
mdadm: /dev/sdb is not a container and one is required.
mdadm: Cannot assemble mbr metadata on /dev/sda1
mdadm: Cannot assemble mbr metadata on /dev/sda
Output of 'dmesg | grep md:'
md: linear personality registered for level -1
md: raid0 personality registered for level 0
md: raid1 personality registered for level 1
md: raid10 personality registered for level 10
md: raid6 personality registered for level 6
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
md: multipath personality registered for level -4
md: md127 stopped.
md: bind<sdf>


Output of 'mdadm -E /dev/sda'
/dev/sda:
   MBR Magic : aa55
Partition[0] :   3907022848 sectors at         2048 (type 07)


Output of 'mdadm -E /dev/sdb'
/dev/sdb:
          Magic : de11de11
        Version : 01.00.00
Controller GUID : 4C534920:20202020:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF
                  (LSI     )
 Container GUID : 4C534920:20202020:80861D68:113700B2:3FB602B2:926C6F1A
                  (LSI      11/14/13 10:40:50)
            Seq : 000000de
  Redundant hdr : yes
  Virtual Disks : 1

      VD GUID[0] : 4C534920:20202020:80861D60:00000000:3FBF690F:00001450
                  (LSI      11/21/13 13:47:59)
         unit[0] : 0
        state[0] : Optimal, Consistent
   init state[0] : Fully Initialised
       access[0] : Read/Write
         Name[0] : MegaSR   R5 #0
 Raid Devices[0] : 4 (0 1 2 3)
   Chunk Size[0] : 128 sectors
   Raid Level[0] : RAID5
  Device Size[0] : 142577664
   Array Size[0] : 427732992

 Physical Disks : 5
      Number    RefNo      Size       Device      Type/State
         0    ffffe3be  142577664K /dev/sdb        active/Online
         1    ffbf7ff4  142577664K                 active/Online
         2    ffffd801  142577664K                 active/Online
         3    bff6febe  142577664K                 active/Online
         4    bfffcf03  142577664K                 activeGlobal-Spare/Offline


Output of 'mdadm -E /dev/sdc'
/dev/sdc:
          Magic : de11de11
        Version : 01.00.00
Controller GUID : 4C534920:20202020:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF
                  (LSI     )
 Container GUID : 4C534920:20202020:80861D68:113700B2:3FB602B2:926C6F1A
                  (LSI      11/14/13 10:40:50)
            Seq : 000000de
  Redundant hdr : yes
  Virtual Disks : 1

      VD GUID[0] : 4C534920:20202020:80861D60:00000000:3FBF690F:00001450
                  (LSI      11/21/13 13:47:59)
         unit[0] : 0
        state[0] : Optimal, Consistent
   init state[0] : Fully Initialised
       access[0] : Read/Write
         Name[0] : MegaSR   R5 #0
 Raid Devices[0] : 4 (0 1 2 3)
   Chunk Size[0] : 128 sectors
   Raid Level[0] : RAID5
  Device Size[0] : 142577664
   Array Size[0] : 427732992

 Physical Disks : 5
      Number    RefNo      Size       Device      Type/State
         0    ffffe3be  142577664K                 active/Online
         1    ffbf7ff4  142577664K /dev/sdc        active/Online
         2    ffffd801  142577664K                 active/Online
         3    bff6febe  142577664K                 active/Online
         4    bfffcf03  142577664K                 activeGlobal-Spare/Offline


Output of 'mdadm -E /dev/sdd'
/dev/sdd:
          Magic : de11de11
        Version : 01.00.00
Controller GUID : 4C534920:20202020:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF
                  (LSI     )
 Container GUID : 4C534920:20202020:80861D68:113700B2:3FB602B2:926C6F1A
                  (LSI      11/14/13 10:40:50)
            Seq : 000000de
  Redundant hdr : yes
  Virtual Disks : 1

      VD GUID[0] : 4C534920:20202020:80861D60:00000000:3FBF690F:00001450
                  (LSI      11/21/13 13:47:59)
         unit[0] : 0
        state[0] : Optimal, Consistent
   init state[0] : Fully Initialised
       access[0] : Read/Write
         Name[0] : MegaSR   R5 #0
 Raid Devices[0] : 4 (0 1 2 3)
   Chunk Size[0] : 128 sectors
   Raid Level[0] : RAID5
  Device Size[0] : 142577664
   Array Size[0] : 427732992

 Physical Disks : 5
      Number    RefNo      Size       Device      Type/State
         0    ffffe3be  142577664K                 active/Online
         1    ffbf7ff4  142577664K                 active/Online
         2    ffffd801  142577664K /dev/sdd        active/Online
         3    bff6febe  142577664K                 active/Online
         4    bfffcf03  142577664K                 activeGlobal-Spare/Offline


Output of 'mdadm -E /dev/sde'
/dev/sde:
          Magic : de11de11
        Version : 01.00.00
Controller GUID : 4C534920:20202020:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF
                  (LSI     )
 Container GUID : 4C534920:20202020:80861D68:113700B2:3FB602B2:926C6F1A
                  (LSI      11/14/13 10:40:50)
            Seq : 000000de
  Redundant hdr : yes
  Virtual Disks : 1

      VD GUID[0] : 4C534920:20202020:80861D60:00000000:3FBF690F:00001450
                  (LSI      11/21/13 13:47:59)
         unit[0] : 0
        state[0] : Optimal, Consistent
   init state[0] : Fully Initialised
       access[0] : Read/Write
         Name[0] : MegaSR   R5 #0
 Raid Devices[0] : 4 (0 1 2 3)
   Chunk Size[0] : 128 sectors
   Raid Level[0] : RAID5
  Device Size[0] : 142577664
   Array Size[0] : 427732992

 Physical Disks : 5
      Number    RefNo      Size       Device      Type/State
         0    ffffe3be  142577664K                 active/Online
         1    ffbf7ff4  142577664K                 active/Online
         2    ffffd801  142577664K                 active/Online
         3    bff6febe  142577664K /dev/sde        active/Online
         4    bfffcf03  142577664K                 activeGlobal-Spare/Offline


Output of 'mdadm -E /dev/sdf'
/dev/sdf:
          Magic : de11de11
        Version : 01.00.00
Controller GUID : 4C534920:20202020:FFFFFFFF:FFFFFFFF:FFFFFFFF:FFFFFFFF
                  (LSI     )
 Container GUID : 4C534920:20202020:80861D68:113700B2:3FB602B2:926C6F1A
                  (LSI      11/14/13 10:40:50)
            Seq : 000000de
  Redundant hdr : yes
  Virtual Disks : 1

      VD GUID[0] : 4C534920:20202020:80861D60:00000000:3FBF690F:00001450
                  (LSI      11/21/13 13:47:59)
         unit[0] : 0
        state[0] : Optimal, Consistent
   init state[0] : Fully Initialised
       access[0] : Read/Write
         Name[0] : MegaSR   R5 #0

 Physical Disks : 5
      Number    RefNo      Size       Device      Type/State
         0    ffffe3be  142577664K                 active/Online
         1    ffbf7ff4  142577664K                 active/Online
         2    ffffd801  142577664K                 active/Online
         3    bff6febe  142577664K                 active/Online
         4    bfffcf03  142577664K /dev/sdf        activeGlobal-Spare/Offline


Output of 'mdadm -E /dev/sdg'
/dev/sdg:
   MBR Magic : aa55
Partition[0] :   1141509631 sectors at    778135908 (type 72)
Partition[1] :   1936028240 sectors at    168689522 (type 65)
Partition[2] :   1936028192 sectors at   1869881465 (type 79)
Partition[3] :        55499 sectors at   2885681152 (type 0d)

Contents of /dev/mapper directory:
crw------T    1 root     root       10, 236 Nov 21 13:48 control

startraid mode = null (default)

Contents of /tbu/utility/dmraid-boot.txt file (from activate command on boot):
NOTICE: checking format identifier asr
NOTICE: checking format identifier hpt37x
NOTICE: checking format identifier hpt45x
NOTICE: checking format identifier jmicron
NOTICE: checking format identifier lsi
NOTICE: checking format identifier nvidia
NOTICE: checking format identifier pdc
NOTICE: checking format identifier sil
NOTICE: checking format identifier via
NOTICE: checking format identifier dos
WARN: locking /var/lock/dmraid/.lock
NOTICE: /dev/sdf: asr     discovering
NOTICE: /dev/sdf: hpt37x  discovering
NOTICE: /dev/sdf: hpt45x  discovering
NOTICE: /dev/sdf: jmicron discovering
NOTICE: /dev/sdf: lsi     discovering
NOTICE: /dev/sdf: nvidia  discovering
NOTICE: /dev/sdf: pdc     discovering
NOTICE: /dev/sdf: sil     discovering
NOTICE: /dev/sdf: via     discovering
NOTICE: /dev/sde: asr     discovering
NOTICE: /dev/sde: hpt37x  discovering
NOTICE: /dev/sde: hpt45x  discovering
NOTICE: /dev/sde: jmicron discovering
NOTICE: /dev/sde: lsi     discovering
NOTICE: /dev/sde: nvidia  discovering
NOTICE: /dev/sde: pdc     discovering
NOTICE: /dev/sde: sil     discovering
NOTICE: /dev/sde: via     discovering
NOTICE: /dev/sdd: asr     discovering
NOTICE: /dev/sdd: hpt37x  discovering
NOTICE: /dev/sdd: hpt45x  discovering
NOTICE: /dev/sdd: jmicron discovering
NOTICE: /dev/sdd: lsi     discovering
NOTICE: /dev/sdd: nvidia  discovering
NOTICE: /dev/sdd: pdc     discovering
NOTICE: /dev/sdd: sil     discovering
NOTICE: /dev/sdd: via     discovering
NOTICE: /dev/sdc: asr     discovering
NOTICE: /dev/sdc: hpt37x  discovering
NOTICE: /dev/sdc: hpt45x  discovering
NOTICE: /dev/sdc: jmicron discovering
NOTICE: /dev/sdc: lsi     discovering
NOTICE: /dev/sdc: nvidia  discovering
NOTICE: /dev/sdc: pdc     discovering
NOTICE: /dev/sdc: sil     discovering
NOTICE: /dev/sdc: via     discovering
NOTICE: /dev/sdb: asr     discovering
NOTICE: /dev/sdb: hpt37x  discovering
NOTICE: /dev/sdb: hpt45x  discovering
NOTICE: /dev/sdb: jmicron discovering
NOTICE: /dev/sdb: lsi     discovering
NOTICE: /dev/sdb: nvidia  discovering
NOTICE: /dev/sdb: pdc     discovering
NOTICE: /dev/sdb: sil     discovering
NOTICE: /dev/sdb: via     discovering
NOTICE: /dev/sda: asr     discovering
NOTICE: /dev/sda: hpt37x  discovering
NOTICE: /dev/sda: hpt45x  discovering
NOTICE: /dev/sda: jmicron discovering
NOTICE: /dev/sda: lsi     discovering
NOTICE: /dev/sda: nvidia  discovering
NOTICE: /dev/sda: pdc     discovering
NOTICE: /dev/sda: sil     discovering
NOTICE: /dev/sda: via     discovering
no raid disks with format:
"asr,hpt37x,hpt45x,jmicron,lsi,nvidia,pdc,sil,via,dos"
WARN: unlocking /var/lock/dmraid/.lock

Output of dmraid --raid_devices command:
/dev/sdf: ddf1, ".ddf1_disks", GROUP, unknown, 285155328 sectors, data@ 0
/dev/sde: ddf1, ".ddf1_disks", GROUP, ok, 285155328 sectors, data@ 0
/dev/sdd: ddf1, ".ddf1_disks", GROUP, ok, 285155328 sectors, data@ 0
/dev/sdc: ddf1, ".ddf1_disks", GROUP, ok, 285155328 sectors, data@ 0
/dev/sdb: ddf1, ".ddf1_disks", GROUP, ok, 285155328 sectors, data@ 0

Output of dmraid -s -s command:
*** Group superset .ddf1_disks
--> Subset
name   : ddf1_MegaSR   R5 #0
size   : 855465984
stride : 128
type   : raid5_ls
status : ok
subsets: 0
devs   : 4
spares : 0

Output of blkid command:
/dev/sda1: LABEL="Elements" UUID="DA1CE6D71CE6AE27" TYPE="ntfs"
/dev/sdc: UUID="LSI     M-^@M-^F^]h^Q7" TYPE="ddf_raid_member"
/dev/sdd: UUID="LSI     M-^@M-^F^]h^Q7" TYPE="ddf_raid_member"
/dev/sr0: LABEL="iflnet" TYPE="iso9660"
/dev/sdf: UUID="LSI     M-^@M-^F^]h^Q7" TYPE="ddf_raid_member"
/dev/sdg: LABEL="CS SUPPORT" UUID="4CC3-3665" TYPE="vfat"

Output of free command (memory):

             total       used       free     shared    buffers     cached
Mem:       2039280     185776    1853504          0      40996      77272
-/+ buffers/cache:      67508    1971772
Swap:            0          0          0

Output of lsmod command (loaded modules):

Module                  Size  Used by    Not tainted
sr_mod                 10387  0
cdrom                  24577  1 sr_mod
sg                     17182  0
sd_mod                 25922  1
crc_t10dif               996  1 sd_mod
usb_storage            32098  0
hid_generic              665  0
usbhid                 18838  0
hid                    60174  2 hid_generic,usbhid
pcspkr                  1227  0
i2c_i801                7257  0
igb                    98565  0
i2c_algo_bit            3747  1 igb
isci                   71218  1
i2c_core               13114  3 i2c_i801,igb,i2c_algo_bit
libsas                 47529  1 isci
ptp                     5420  1 igb
pps_core                4544  1 ptp
ehci_pci                2540  0
scsi_transport_sas     16455  2 isci,libsas
processor              14951  8
thermal_sys            12739  1 processor
hwmon                    881  2 igb,thermal_sys
button                  3413  0
dm_mod                 50912  0
ehci_hcd               28628  1 ehci_pci
edd                     5144  0


Output of lspci -knn command:

00:00.0 Host bridge [0600]: Intel Corporation Xeon E5/Core i7 DMI2
[8086:3c00] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
00:01.0 PCI bridge [0604]: Intel Corporation Xeon E5/Core i7 IIO PCI
Express Root Port 1a [8086:3c02] (rev 07)
    Kernel driver in use: pcieport
00:02.0 PCI bridge [0604]: Intel Corporation Xeon E5/Core i7 IIO PCI
Express Root Port 2a [8086:3c04] (rev 07)
    Kernel driver in use: pcieport
00:03.0 PCI bridge [0604]: Intel Corporation Xeon E5/Core i7 IIO PCI
Express Root Port 3a in PCI Express Mode [8086:3c08] (rev 07)
    Kernel driver in use: pcieport
00:05.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Address Map, VTd_Misc, System Management [8086:3c28] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
00:05.2 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Control Status and Global Errors [8086:3c2a] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
00:05.4 PIC [0800]: Intel Corporation Xeon E5/Core i7 I/O APIC
[8086:3c2c] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
00:1a.0 USB controller [0c03]: Intel Corporation C600/X79 series
chipset USB2 Enhanced Host Controller #2 [8086:1d2d] (rev 06)
    Subsystem: Cisco Systems Inc Device [1137:0101]
    Kernel driver in use: ehci-pci
00:1c.0 PCI bridge [0604]: Intel Corporation C600/X79 series chipset
PCI Express Root Port 1 [8086:1d10] (rev b6)
    Kernel driver in use: pcieport
00:1c.7 PCI bridge [0604]: Intel Corporation C600/X79 series chipset
PCI Express Root Port 8 [8086:1d1e] (rev b6)
    Kernel driver in use: pcieport
00:1d.0 USB controller [0c03]: Intel Corporation C600/X79 series
chipset USB2 Enhanced Host Controller #1 [8086:1d26] (rev 06)
    Subsystem: Cisco Systems Inc Device [1137:0101]
    Kernel driver in use: ehci-pci
00:1e.0 PCI bridge [0604]: Intel Corporation 82801 PCI Bridge
[8086:244e] (rev a6)
00:1f.0 ISA bridge [0601]: Intel Corporation C600/X79 series chipset
LPC Controller [8086:1d41] (rev 06)
    Subsystem: Cisco Systems Inc Device [1137:0101]
00:1f.3 SMBus [0c05]: Intel Corporation C600/X79 series chipset SMBus
Host Controller [8086:1d22] (rev 06)
    Subsystem: Cisco Systems Inc Device [1137:0101]
    Kernel driver in use: i801_smbus
00:1f.6 Signal processing controller [1180]: Intel Corporation
C600/X79 series chipset Thermal Management Controller [8086:1d24] (rev
06)
    Subsystem: Cisco Systems Inc Device [1137:0101]
01:00.0 PCI bridge [0604]: Intel Corporation C608/C606/X79 series
chipset PCI Express Upstream Port [8086:1d74] (rev 06)
    Kernel driver in use: pcieport
02:08.0 PCI bridge [0604]: Intel Corporation C608/C606/X79 series
chipset PCI Express Virtual Switch Port [8086:1d3f] (rev 06)
    Kernel driver in use: pcieport
03:00.0 Serial Attached SCSI controller [0107]: Intel Corporation C606
chipset Dual 4-Port SATA/SAS Storage Control Unit [8086:1d68] (rev 06)
    Subsystem: Cisco Systems Inc Device [1137:00b2]
    Kernel driver in use: isci
04:00.0 Ethernet controller [0200]: Intel Corporation I350 Gigabit
Network Connection [8086:1521] (rev 01)
    Subsystem: Cisco Systems Inc Device [1137:008a]
    Kernel driver in use: igb
04:00.1 Ethernet controller [0200]: Intel Corporation I350 Gigabit
Network Connection [8086:1521] (rev 01)
    Subsystem: Cisco Systems Inc Device [1137:008a]
    Kernel driver in use: igb
08:00.0 VGA compatible controller [0300]: Matrox Electronics Systems
Ltd. MGA G200e [Pilot] ServerEngines (SEP1) [102b:0522] (rev 02)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:08.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
QPI Link 0 [8086:3c80] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:08.3 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
QPI Link Reut 0 [8086:3c83] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:08.4 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
QPI Link Reut 0 [8086:3c84] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:09.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
QPI Link 1 [8086:3c90] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:09.3 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
QPI Link Reut 1 [8086:3c93] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:09.4 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
QPI Link Reut 1 [8086:3c94] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:0a.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Power Control Unit 0 [8086:3cc0] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:0a.1 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Power Control Unit 1 [8086:3cc1] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:0a.2 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Power Control Unit 2 [8086:3cc2] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:0a.3 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Power Control Unit 3 [8086:3cd0] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:0b.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Interrupt Control Registers [8086:3ce0] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:0b.3 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Semaphore and Scratchpad Configuration Registers [8086:3ce3] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:0c.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Unicast Register 0 [8086:3ce8] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:0c.1 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Unicast Register 0 [8086:3ce8] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:0c.6 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller System Address Decoder 0 [8086:3cf4] (rev
07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:0c.7 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
System Address Decoder [8086:3cf6] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:0d.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Unicast Register 0 [8086:3ce8] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:0d.1 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Unicast Register 0 [8086:3ce8] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:0d.6 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller System Address Decoder 1 [8086:3cf5] (rev
07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:0e.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Processor Home Agent [8086:3ca0] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:0e.1 Performance counters [1101]: Intel Corporation Xeon E5/Core i7
Processor Home Agent Performance Monitoring [8086:3c46] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
    Kernel driver in use: snbep_uncore
7f:0f.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller Registers [8086:3ca8] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:0f.1 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller RAS Registers [8086:3c71] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:0f.2 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller Target Address Decoder 0 [8086:3caa] (rev
07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:0f.3 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller Target Address Decoder 1 [8086:3cab] (rev
07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:0f.4 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller Target Address Decoder 2 [8086:3cac] (rev
07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:0f.5 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller Target Address Decoder 3 [8086:3cad] (rev
07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:0f.6 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller Target Address Decoder 4 [8086:3cae] (rev
07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:10.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller Channel 0-3 Thermal Control 0 [8086:3cb0]
(rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
    Kernel driver in use: snbep_uncore
7f:10.1 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller Channel 0-3 Thermal Control 1 [8086:3cb1]
(rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
    Kernel driver in use: snbep_uncore
7f:10.2 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller ERROR Registers 0 [8086:3cb2] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:10.3 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller ERROR Registers 1 [8086:3cb3] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:10.4 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller Channel 0-3 Thermal Control 2 [8086:3cb4]
(rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
    Kernel driver in use: snbep_uncore
7f:10.5 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller Channel 0-3 Thermal Control 3 [8086:3cb5]
(rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
    Kernel driver in use: snbep_uncore
7f:10.6 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller ERROR Registers 2 [8086:3cb6] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:10.7 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller ERROR Registers 3 [8086:3cb7] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:11.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
DDRIO [8086:3cb8] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:13.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
R2PCIe [8086:3ce4] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:13.1 Performance counters [1101]: Intel Corporation Xeon E5/Core i7
Ring to PCI Express Performance Monitor [8086:3c43] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
    Kernel driver in use: snbep_uncore
7f:13.4 Performance counters [1101]: Intel Corporation Xeon E5/Core i7
QuickPath Interconnect Agent Ring Registers [8086:3ce6] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
7f:13.5 Performance counters [1101]: Intel Corporation Xeon E5/Core i7
Ring to QuickPath Interconnect Link 0 Performance Monitor [8086:3c44]
(rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
    Kernel driver in use: snbep_uncore
7f:13.6 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Ring to QuickPath Interconnect Link 1 Performance Monitor [8086:3c45]
(rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
    Kernel driver in use: snbep_uncore
80:01.0 PCI bridge [0604]: Intel Corporation Xeon E5/Core i7 IIO PCI
Express Root Port 1a [8086:3c02] (rev 07)
    Kernel driver in use: pcieport
80:03.0 PCI bridge [0604]: Intel Corporation Xeon E5/Core i7 IIO PCI
Express Root Port 3a in PCI Express Mode [8086:3c08] (rev 07)
    Kernel driver in use: pcieport
80:05.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Address Map, VTd_Misc, System Management [8086:3c28] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
80:05.2 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Control Status and Global Errors [8086:3c2a] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
80:05.4 PIC [0800]: Intel Corporation Xeon E5/Core i7 I/O APIC
[8086:3c2c] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:08.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
QPI Link 0 [8086:3c80] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:08.3 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
QPI Link Reut 0 [8086:3c83] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:08.4 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
QPI Link Reut 0 [8086:3c84] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:09.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
QPI Link 1 [8086:3c90] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:09.3 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
QPI Link Reut 1 [8086:3c93] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:09.4 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
QPI Link Reut 1 [8086:3c94] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:0a.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Power Control Unit 0 [8086:3cc0] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:0a.1 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Power Control Unit 1 [8086:3cc1] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:0a.2 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Power Control Unit 2 [8086:3cc2] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:0a.3 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Power Control Unit 3 [8086:3cd0] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:0b.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Interrupt Control Registers [8086:3ce0] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:0b.3 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Semaphore and Scratchpad Configuration Registers [8086:3ce3] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:0c.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Unicast Register 0 [8086:3ce8] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:0c.1 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Unicast Register 0 [8086:3ce8] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:0c.6 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller System Address Decoder 0 [8086:3cf4] (rev
07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:0c.7 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
System Address Decoder [8086:3cf6] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:0d.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Unicast Register 0 [8086:3ce8] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:0d.1 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Unicast Register 0 [8086:3ce8] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:0d.6 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller System Address Decoder 1 [8086:3cf5] (rev
07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:0e.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Processor Home Agent [8086:3ca0] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:0e.1 Performance counters [1101]: Intel Corporation Xeon E5/Core i7
Processor Home Agent Performance Monitoring [8086:3c46] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
    Kernel driver in use: snbep_uncore
ff:0f.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller Registers [8086:3ca8] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:0f.1 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller RAS Registers [8086:3c71] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:0f.2 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller Target Address Decoder 0 [8086:3caa] (rev
07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:0f.3 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller Target Address Decoder 1 [8086:3cab] (rev
07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:0f.4 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller Target Address Decoder 2 [8086:3cac] (rev
07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:0f.5 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller Target Address Decoder 3 [8086:3cad] (rev
07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:0f.6 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller Target Address Decoder 4 [8086:3cae] (rev
07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:10.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller Channel 0-3 Thermal Control 0 [8086:3cb0]
(rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
    Kernel driver in use: snbep_uncore
ff:10.1 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller Channel 0-3 Thermal Control 1 [8086:3cb1]
(rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
    Kernel driver in use: snbep_uncore
ff:10.2 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller ERROR Registers 0 [8086:3cb2] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:10.3 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller ERROR Registers 1 [8086:3cb3] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:10.4 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller Channel 0-3 Thermal Control 2 [8086:3cb4]
(rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
    Kernel driver in use: snbep_uncore
ff:10.5 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller Channel 0-3 Thermal Control 3 [8086:3cb5]
(rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
    Kernel driver in use: snbep_uncore
ff:10.6 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller ERROR Registers 2 [8086:3cb6] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:10.7 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Integrated Memory Controller ERROR Registers 3 [8086:3cb7] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:11.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
DDRIO [8086:3cb8] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:13.0 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
R2PCIe [8086:3ce4] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:13.1 Performance counters [1101]: Intel Corporation Xeon E5/Core i7
Ring to PCI Express Performance Monitor [8086:3c43] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
    Kernel driver in use: snbep_uncore
ff:13.4 Performance counters [1101]: Intel Corporation Xeon E5/Core i7
QuickPath Interconnect Agent Ring Registers [8086:3ce6] (rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
ff:13.5 Performance counters [1101]: Intel Corporation Xeon E5/Core i7
Ring to QuickPath Interconnect Link 0 Performance Monitor [8086:3c44]
(rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
    Kernel driver in use: snbep_uncore
ff:13.6 System peripheral [0880]: Intel Corporation Xeon E5/Core i7
Ring to QuickPath Interconnect Link 1 Performance Monitor [8086:3c45]
(rev 07)
    Subsystem: Cisco Systems Inc Device [1137:0101]
    Kernel driver in use: snbep_uncore


On Thu, Nov 21, 2013 at 12:46 PM, Martin Wilck <mwilck@arcor.de> wrote:
> On 11/20/2013 03:30 AM, NeilBrown wrote:
>
>> David: if you remove the "UUID=" part for the array leaving the
>> "container=.... member=0" as the identification, does it work?
>
> I second that. David, please try it. I'd also appreciate "mdadm -E
> /dev/sdX" output for all the RAID disks.
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-21 23:05                         ` David F.
@ 2013-11-21 23:09                           ` David F.
  2013-11-22  3:06                             ` David F.
  2013-11-25 21:56                           ` Martin Wilck
  1 sibling, 1 reply; 44+ messages in thread
From: David F. @ 2013-11-21 23:09 UTC (permalink / raw)
  To: Martin Wilck; +Cc: NeilBrown, linux-raid

They also said: RAID 5 with 4 physical drives.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-21 23:09                           ` David F.
@ 2013-11-22  3:06                             ` David F.
  2013-11-22 18:36                               ` David F.
  0 siblings, 1 reply; 44+ messages in thread
From: David F. @ 2013-11-22  3:06 UTC (permalink / raw)
  To: Martin Wilck; +Cc: NeilBrown, linux-raid

So it looks like sdf shouldn't be part of the RAID5 set - maybe it's a
stand alone drive configured on the raid controller?  I can ask if
they know...

On Thu, Nov 21, 2013 at 3:09 PM, David F. <df7729@gmail.com> wrote:
> They also said: RAID 5 with 4 physical drives.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-22  3:06                             ` David F.
@ 2013-11-22 18:36                               ` David F.
  2013-11-23 23:36                                 ` David F.
  0 siblings, 1 reply; 44+ messages in thread
From: David F. @ 2013-11-22 18:36 UTC (permalink / raw)
  To: Martin Wilck; +Cc: NeilBrown, linux-raid

Okay, got the word on what the sdf drive is:

"5th drive is configured as spare for RAID 5"

On Thu, Nov 21, 2013 at 7:06 PM, David F. <df7729@gmail.com> wrote:
> So it looks like sdf shouldn't be part of the RAID5 set - maybe it's a
> stand alone drive configured on the raid controller?  I can ask if
> they know...
>
> On Thu, Nov 21, 2013 at 3:09 PM, David F. <df7729@gmail.com> wrote:
>> They also said: RAID 5 with 4 physical drives.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-22 18:36                               ` David F.
@ 2013-11-23 23:36                                 ` David F.
  0 siblings, 0 replies; 44+ messages in thread
From: David F. @ 2013-11-23 23:36 UTC (permalink / raw)
  To: Martin Wilck; +Cc: NeilBrown, linux-raid

Did that report give you everything needed?

When do you think the next release of mdadm will be ready?  Just
trying to plan our next release.

Thanks!

On Fri, Nov 22, 2013 at 10:36 AM, David F. <df7729@gmail.com> wrote:
> Okay, got the word on what the sdf drive is:
>
> "5th drive is configured as spare for RAID 5"
>
> On Thu, Nov 21, 2013 at 7:06 PM, David F. <df7729@gmail.com> wrote:
>> So it looks like sdf shouldn't be part of the RAID5 set - maybe it's a
>> stand alone drive configured on the raid controller?  I can ask if
>> they know...
>>
>> On Thu, Nov 21, 2013 at 3:09 PM, David F. <df7729@gmail.com> wrote:
>>> They also said: RAID 5 with 4 physical drives.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-21 22:39                                 ` David F.
@ 2013-11-25 21:39                                   ` Martin Wilck
  0 siblings, 0 replies; 44+ messages in thread
From: Martin Wilck @ 2013-11-25 21:39 UTC (permalink / raw)
  To: David F., linux-raid

On 11/21/2013 11:39 PM, David F. wrote:
> Are you saying that the old obsolete DMRAID should still be used for
> DDF RAID?   What about the 2TiB limit?  I'd rather see modern linux
> RAID support work as well as it does for Windows.

No, I haven't said that. On the contrary, I am actively working on
getting mdadm support for DDF into the main distributions.

I was just wondering about your setup because as far as I know, no
distribution enables mdadm support for DDF. Doing that requires changes
in the distribution's udev rules, initrd/initramfs generation code, and
installer. See my DDF page in the Linux RAID wiki for details.
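
(Purely as an illustration of what that plumbing ends up doing: the udev
rules feed each DDF member device to mdadm's incremental assembly as it
appears, which can be approximated by hand from a shell, taking the
members to be /dev/sdb through /dev/sdf as in the report above:

  # roughly what the udev rule triggers for each member device
  mdadm --incremental /dev/sdb    # repeat for sdc, sdd, sde, sdf
  # or, once all disks are visible, assemble everything in one pass
  mdadm --assemble --scan

The exact rules, helper paths and initramfs hooks differ per distribution.)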

Regards
Martin


> 
> On Thu, Nov 21, 2013 at 1:30 PM, Martin Wilck <mwilck@arcor.de> wrote:
>> On 11/21/2013 10:10 PM, David F. wrote:
>>> On that DMRAID - they are still yet to try it.  But we do know the
>>> RAID5 works via Int13h interface in real mode and via Windows.  I
>>> think they thought it was a 4 disk array?  I'll ask if they know the
>>> actual number of drives in the RAID configuration.
>>
>> What distribution are these people using? I am not aware of any distro
>> that would activate mdadm for DDF RAID by default.
>>
>> Martin


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-21 23:05                         ` David F.
  2013-11-21 23:09                           ` David F.
@ 2013-11-25 21:56                           ` Martin Wilck
  2013-11-26  0:24                             ` David F.
  2013-11-26 21:59                             ` David F.
  1 sibling, 2 replies; 44+ messages in thread
From: Martin Wilck @ 2013-11-25 21:56 UTC (permalink / raw)
  To: David F.; +Cc: NeilBrown, linux-raid

The metadata really looks all ok, but mdadm fails to assemble it
correctly. It does look like a bug to me. The problem seems to be
related to the global spare somehow. I have no idea why.

The best way to proceed would be to run mdadm --dump on
/dev/sd[bcdef], put the output in a tgz and make it available to us
somewhere. So we could actually debug the assembly.
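
(A rough sketch of that workflow, assuming mdadm 3.3's --dump option,
which writes a sparse metadata image per listed device into a target
directory; the exact option spelling may vary between versions:

  mkdir /tmp/ddf-meta
  mdadm --dump=/tmp/ddf-meta /dev/sd[bcdef]
  # tar's -S flag preserves sparseness, keeping the archive small
  tar -Sczf ddf-meta.tgz -C /tmp ddf-meta
)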

Regards,
Martin




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-25 21:56                           ` Martin Wilck
@ 2013-11-26  0:24                             ` David F.
  2013-11-26 21:59                             ` David F.
  1 sibling, 0 replies; 44+ messages in thread
From: David F. @ 2013-11-26  0:24 UTC (permalink / raw)
  To: Martin Wilck; +Cc: NeilBrown, linux-raid

Thanks. What would be the actual command to create the --dump, and
where does the file go?  Testing it on the imsm RAID 1 didn't seem to
work for us.



On Mon, Nov 25, 2013 at 1:56 PM, Martin Wilck <mwilck@arcor.de> wrote:
> The metadata really looks all ok, but mdadm fails to assemble it
> correctly. It does look like a bug to me. The problem seems to be
> related to the global spare somehow. I have no idea why.
>
> The best way to proceed would be to run mdadm --dump on
> /dev/sd[bcdef], put the output in a tgz and make it available to us
> somewhere. So we could actually debug the assembly.
>
> Regards,
> Martin
>
>
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-25 21:56                           ` Martin Wilck
  2013-11-26  0:24                             ` David F.
@ 2013-11-26 21:59                             ` David F.
  2013-11-27 22:40                               ` Martin Wilck
  1 sibling, 1 reply; 44+ messages in thread
From: David F. @ 2013-11-26 21:59 UTC (permalink / raw)
  To: Martin Wilck; +Cc: NeilBrown, linux-raid@vger.kernel.org

does it need to be a tar type file or would virtual hard drive .vmdk
files be okay?

On Mon, Nov 25, 2013 at 1:56 PM, Martin Wilck <mwilck@arcor.de> wrote:
> The metadata really looks all ok, but mdadm fails to assemble it
> correctly. It does look like a bug to me. The problem seems to be
> related to the global spare somehow. I have no idea why.
>
> The best way to proceed would be to run mdadm --dump on
> /dev/sd[bcdef], put the output in a tgz and make it available to us
> somewhere. So we could actually debug the assembly.
>
> Regards,
> Martin
>
>
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-26 21:59                             ` David F.
@ 2013-11-27 22:40                               ` Martin Wilck
  2013-12-06  1:53                                 ` David F.
  0 siblings, 1 reply; 44+ messages in thread
From: Martin Wilck @ 2013-11-27 22:40 UTC (permalink / raw)
  To: David F.; +Cc: NeilBrown, linux-raid@vger.kernel.org

On 11/26/2013 10:59 PM, David F. wrote:
> does it need to be a tar type file or would virtual hard drive .vmdk
> files be okay?

I guess it would be ok (never tried).

Martin

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-11-27 22:40                               ` Martin Wilck
@ 2013-12-06  1:53                                 ` David F.
  2013-12-07  2:28                                   ` David F.
  0 siblings, 1 reply; 44+ messages in thread
From: David F. @ 2013-12-06  1:53 UTC (permalink / raw)
  To: Martin Wilck; +Cc: NeilBrown, linux-raid@vger.kernel.org

Hi,

Got the files from the customer - you can download the zipped tar from
https://www.dropbox.com/s/aq8idkjnxyslho7/raid5data.tgz

Let me know when you get it so we can delete it.

Thanks.


On Wed, Nov 27, 2013 at 2:40 PM, Martin Wilck <mwilck@arcor.de> wrote:
> On 11/26/2013 10:59 PM, David F. wrote:
>> does it need to be a tar type file or would virtual hard drive .vmdk
>> files be okay?
>
> I guess it would be ok (never tried).
>
> Martin

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-12-06  1:53                                 ` David F.
@ 2013-12-07  2:28                                   ` David F.
  2013-12-07  3:16                                     ` NeilBrown
  0 siblings, 1 reply; 44+ messages in thread
From: David F. @ 2013-12-07  2:28 UTC (permalink / raw)
  To: Martin Wilck; +Cc: NeilBrown, linux-raid@vger.kernel.org

Just wondering if the link to the files to debug the problem made it ???

On Thu, Dec 5, 2013 at 5:53 PM, David F. <df7729@gmail.com> wrote:
> Hi,
>
> Got the files from the customer - you can download the zipped tar from
>
> Let me know when you get it so we can delete it.
>
> Thanks.
>
>
> On Wed, Nov 27, 2013 at 2:40 PM, Martin Wilck <mwilck@arcor.de> wrote:
>> On 11/26/2013 10:59 PM, David F. wrote:
>>> does it need to be a tar type file or would virtual hard drive .vmdk
>>> files be okay?
>>
>> I guess it would be ok (never tried).
>>
>> Martin

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-12-07  2:28                                   ` David F.
@ 2013-12-07  3:16                                     ` NeilBrown
  2013-12-07  3:46                                       ` David F.
  2013-12-14 21:01                                       ` David F.
  0 siblings, 2 replies; 44+ messages in thread
From: NeilBrown @ 2013-12-07  3:16 UTC (permalink / raw)
  To: David F.; +Cc: Martin Wilck, linux-raid@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 854 bytes --]

On Fri, 6 Dec 2013 18:28:16 -0800 "David F." <df7729@gmail.com> wrote:

> Just wondering if the link to the files to debug the problem made it ???
> 
> On Thu, Dec 5, 2013 at 5:53 PM, David F. <df7729@gmail.com> wrote:
> > Hi,
> >
> > Got the files from the customer - you can download the zipped tar from
> >
> > Let me know when you get it so we can delete it.
> >
> > Thanks.
> >
> >
> > On Wed, Nov 27, 2013 at 2:40 PM, Martin Wilck <mwilck@arcor.de> wrote:
> >> On 11/26/2013 10:59 PM, David F. wrote:
> >>> does it need to be a tar type file or would virtual hard drive .vmdk
> >>> files be okay?
> >>
> >> I guess it would be ok (never tried).
> >>
> >> Martin

I grabbed a copy, though at 1967 bytes you could have attached it to the
email safely :-)

I haven't had a chance to look at them in detail yet.

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-12-07  3:16                                     ` NeilBrown
@ 2013-12-07  3:46                                       ` David F.
  2013-12-14 21:01                                       ` David F.
  1 sibling, 0 replies; 44+ messages in thread
From: David F. @ 2013-12-07  3:46 UTC (permalink / raw)
  To: NeilBrown; +Cc: Martin Wilck, linux-raid@vger.kernel.org

Okay, I wasn't sure it was allowed / would go through.  We used tar's
sparse file support or they would be very large.

I'll have the link stay up in case Martin wants a copy...

Thanks!


On Fri, Dec 6, 2013 at 7:16 PM, NeilBrown <neilb@suse.de> wrote:
> On Fri, 6 Dec 2013 18:28:16 -0800 "David F." <df7729@gmail.com> wrote:
>
>> Just wondering if the link to the files to debug the problem made it ???
>>
>> On Thu, Dec 5, 2013 at 5:53 PM, David F. <df7729@gmail.com> wrote:
>> > Hi,
>> >
>> > Got the files from the customer - you can download the zipped tar from
>> >
>> > Let me know when you get it so we can delete it.
>> >
>> > Thanks.
>> >
>> >
>> > On Wed, Nov 27, 2013 at 2:40 PM, Martin Wilck <mwilck@arcor.de> wrote:
>> >> On 11/26/2013 10:59 PM, David F. wrote:
>> >>> does it need to be a tar type file or would virtual hard drive .vmdk
>> >>> files be okay?
>> >>
>> >> I guess it would be ok (never tried).
>> >>
>> >> Martin
>
> I grabbed a copy, though at 1967 bytes you could have attached it to the
> email safely :-)
>
> I haven't had a chance to look at them in detail yet.
>
> NeilBrown

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-12-07  3:16                                     ` NeilBrown
  2013-12-07  3:46                                       ` David F.
@ 2013-12-14 21:01                                       ` David F.
  2014-01-20  4:34                                         ` NeilBrown
  1 sibling, 1 reply; 44+ messages in thread
From: David F. @ 2013-12-14 21:01 UTC (permalink / raw)
  To: NeilBrown; +Cc: Martin Wilck, linux-raid@vger.kernel.org

Hi,

Just wondering if this gave you guys everything you needed to figure
out the issue?

Also, any idea on when 3.4 may be out with the various fixes?

Thanks.


On Fri, Dec 6, 2013 at 7:16 PM, NeilBrown <neilb@suse.de> wrote:
> On Fri, 6 Dec 2013 18:28:16 -0800 "David F." <df7729@gmail.com> wrote:
>
>> Just wondering if the link to the files to debug the problem made it ???
>>
>> On Thu, Dec 5, 2013 at 5:53 PM, David F. <df7729@gmail.com> wrote:
>> > Hi,
>> >
>> > Got the files from the customer - you can download the zipped tar from
>> >
>> > Let me know when you get it so we can delete it.
>> >
>> > Thanks.
>> >
>> >
>> > On Wed, Nov 27, 2013 at 2:40 PM, Martin Wilck <mwilck@arcor.de> wrote:
>> >> On 11/26/2013 10:59 PM, David F. wrote:
>> >>> does it need to be a tar type file or would virtual hard drive .vmdk
>> >>> files be okay?
>> >>
>> >> I guess it would be ok (never tried).
>> >>
>> >> Martin
>
> I grabbed a copy, though at 1967 bytes you could have attached it to the
> email safely :-)
>
> I haven't had a chance to look at them in detail yet.
>
> NeilBrown

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2013-12-14 21:01                                       ` David F.
@ 2014-01-20  4:34                                         ` NeilBrown
  2014-01-20 21:52                                           ` Martin Wilck
  2014-01-20 23:54                                           ` David F.
  0 siblings, 2 replies; 44+ messages in thread
From: NeilBrown @ 2014-01-20  4:34 UTC (permalink / raw)
  To: David F.; +Cc: Martin Wilck, linux-raid@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 3908 bytes --]

On Sat, 14 Dec 2013 13:01:50 -0800 "David F." <df7729@gmail.com> wrote:

> Hi,
> 
> Just wondering if this gave you guys everything you needed to figure
> out the issue?

I had everything but time.  I've now made the time and have the fix (I hope).

Please try the current HEAD of git://neil.brown.name/mdadm/
The important patch is below.
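
(A minimal sketch of trying that tree, assuming the usual make-based
build; run the freshly built binary directly rather than an installed
copy:

  git clone git://neil.brown.name/mdadm/
  cd mdadm
  make
  ./mdadm --version    # confirm it reports the freshly built version
)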

> 
> Also, any idea on when 3.4 may be out with the various fixes?

I hope to release 3.3.1 some time in February.  Based on past experience it
should be out before Easter, but no promises.

NeilBrown

From f0e876ce03a63f150bb87b2734c139bc8bb285b2 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Mon, 20 Jan 2014 15:27:29 +1100
Subject: [PATCH] DDF: fix detection of failed devices during assembly.

When we call "getinfo_super", we report the working/failed status
of the particular device, and also (via the 'map') the working/failed
status of every other device that this metadata is aware of.

It is important that the way we calculate "working or failed" is
consistent.
As it is, getinfo_super_ddf() will report a spare as "working", but
every other device will see it as "failed", which leads to failure to
assemble arrays with spares.

For getinfo_super_ddf (i.e. for the container), a device is assumed
"working" unless flagged as DDF_Failed.
For getinfo_super_ddf_bvd (for a member array), a device is assumed
"failed" unless DDF_Online is set, and DDF_Failed is not set.

Reported-by: "David F." <df7729@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>

diff --git a/super-ddf.c b/super-ddf.c
index d526d8ad3da9..4242af86fea9 100644
--- a/super-ddf.c
+++ b/super-ddf.c
@@ -1913,6 +1913,7 @@ static void getinfo_super_ddf(struct supertype *st, struct mdinfo *info, char *m
 	info->disk.major = 0;
 	info->disk.minor = 0;
 	if (ddf->dlist) {
+		struct phys_disk_entry *pde = NULL;
 		info->disk.number = be32_to_cpu(ddf->dlist->disk.refnum);
 		info->disk.raid_disk = find_phys(ddf, ddf->dlist->disk.refnum);
 
@@ -1920,12 +1921,19 @@ static void getinfo_super_ddf(struct supertype *st, struct mdinfo *info, char *m
 						  entries[info->disk.raid_disk].
 						  config_size);
 		info->component_size = ddf->dlist->size - info->data_offset;
+		if (info->disk.raid_disk >= 0)
+			pde = ddf->phys->entries + info->disk.raid_disk;
+		if (pde &&
+		    !(be16_to_cpu(pde->state) & DDF_Failed))
+			info->disk.state = (1 << MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE);
+		else
+			info->disk.state = 1 << MD_DISK_FAULTY;
 	} else {
 		info->disk.number = -1;
 		info->disk.raid_disk = -1;
 //		info->disk.raid_disk = find refnum in the table and use index;
+		info->disk.state = (1 << MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE);
 	}
-	info->disk.state = (1 << MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE);
 
 	info->recovery_start = MaxSector;
 	info->reshape_active = 0;
@@ -1943,8 +1951,6 @@ static void getinfo_super_ddf(struct supertype *st, struct mdinfo *info, char *m
 		int i;
 		for (i = 0 ; i < map_disks; i++) {
 			if (i < info->array.raid_disks &&
-			    (be16_to_cpu(ddf->phys->entries[i].state)
-			     & DDF_Online) &&
 			    !(be16_to_cpu(ddf->phys->entries[i].state)
 			      & DDF_Failed))
 				map[i] = 1;
@@ -2017,7 +2023,11 @@ static void getinfo_super_ddf_bvd(struct supertype *st, struct mdinfo *info, cha
 		info->disk.raid_disk = cd + conf->sec_elmnt_seq
 			* be16_to_cpu(conf->prim_elmnt_count);
 		info->disk.number = dl->pdnum;
-		info->disk.state = (1<<MD_DISK_SYNC)|(1<<MD_DISK_ACTIVE);
+		info->disk.state = 0;
+		if (info->disk.number >= 0 &&
+		    (be16_to_cpu(ddf->phys->entries[info->disk.number].state) & DDF_Online) &&
+		    !(be16_to_cpu(ddf->phys->entries[info->disk.number].state) & DDF_Failed))
+			info->disk.state = (1<<MD_DISK_SYNC)|(1<<MD_DISK_ACTIVE);
 	}
 
 	info->container_member = ddf->currentconf->vcnum;

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2014-01-20  4:34                                         ` NeilBrown
@ 2014-01-20 21:52                                           ` Martin Wilck
  2014-01-20 23:54                                           ` David F.
  1 sibling, 0 replies; 44+ messages in thread
From: Martin Wilck @ 2014-01-20 21:52 UTC (permalink / raw)
  To: NeilBrown; +Cc: David F., linux-raid@vger.kernel.org

On 01/20/2014 05:34 AM, NeilBrown wrote:
> I had everything but time.  I've now made the time and have the fix (I hope).

Thanks a lot for looking into this.

Martin

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2014-01-20  4:34                                         ` NeilBrown
  2014-01-20 21:52                                           ` Martin Wilck
@ 2014-01-20 23:54                                           ` David F.
  2014-01-22 22:32                                             ` David F.
  1 sibling, 1 reply; 44+ messages in thread
From: David F. @ 2014-01-20 23:54 UTC (permalink / raw)
  To: NeilBrown; +Cc: Martin Wilck, linux-raid@vger.kernel.org

Ok, thanks - we have sent it on for them to check.

On Sun, Jan 19, 2014 at 8:34 PM, NeilBrown <neilb@suse.de> wrote:
> On Sat, 14 Dec 2013 13:01:50 -0800 "David F." <df7729@gmail.com> wrote:
>
>> Hi,
>>
>> Just wondering if this gave you guys everything you needed to figure
>> out the issue?
>
> I had everything but time.  I've now made the time and have the fix (I hope).
>
> Please try the current HEAD of git://neil.brown.name/mdadm/
> The important patch is below.
>
>>
>> Also, any idea on when 3.4 may be out with the various fixes?
>
> I hope to release 3.3.1 some time in February.  Based on past experience it
> should be out before Easter, but no promises.
>
> NeilBrown
>
> From f0e876ce03a63f150bb87b2734c139bc8bb285b2 Mon Sep 17 00:00:00 2001
> From: NeilBrown <neilb@suse.de>
> Date: Mon, 20 Jan 2014 15:27:29 +1100
> Subject: [PATCH] DDF: fix detection of failed devices during assembly.
>
> When we call "getinfo_super", we report the working/failed status
> of the particular device, and also (via the 'map') the working/failed
> status of every other device that this metadata is aware of.
>
> It is important that the way we calculate "working or failed" is
> consistent.
> As it is, getinfo_super_ddf() will report a spare as "working", but
> every other device will see it as "failed", which leads to failure to
> assemble arrays with spares.
>
> For getinfo_super_ddf (i.e. for the container), a device is assumed
> "working" unless flagged as DDF_Failed.
> For getinfo_super_ddf_bvd (for a member array), a device is assumed
> "failed" unless DDF_Online is set, and DDF_Failed is not set.
>
> Reported-by: "David F." <df7729@gmail.com>
> Signed-off-by: NeilBrown <neilb@suse.de>
>
> diff --git a/super-ddf.c b/super-ddf.c
> index d526d8ad3da9..4242af86fea9 100644
> --- a/super-ddf.c
> +++ b/super-ddf.c
> @@ -1913,6 +1913,7 @@ static void getinfo_super_ddf(struct supertype *st, struct mdinfo *info, char *m
>         info->disk.major = 0;
>         info->disk.minor = 0;
>         if (ddf->dlist) {
> +               struct phys_disk_entry *pde = NULL;
>                 info->disk.number = be32_to_cpu(ddf->dlist->disk.refnum);
>                 info->disk.raid_disk = find_phys(ddf, ddf->dlist->disk.refnum);
>
> @@ -1920,12 +1921,19 @@ static void getinfo_super_ddf(struct supertype *st, struct mdinfo *info, char *m
>                                                   entries[info->disk.raid_disk].
>                                                   config_size);
>                 info->component_size = ddf->dlist->size - info->data_offset;
> +               if (info->disk.raid_disk >= 0)
> +                       pde = ddf->phys->entries + info->disk.raid_disk;
> +               if (pde &&
> +                   !(be16_to_cpu(pde->state) & DDF_Failed))
> +                       info->disk.state = (1 << MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE);
> +               else
> +                       info->disk.state = 1 << MD_DISK_FAULTY;
>         } else {
>                 info->disk.number = -1;
>                 info->disk.raid_disk = -1;
>  //             info->disk.raid_disk = find refnum in the table and use index;
> +               info->disk.state = (1 << MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE);
>         }
> -       info->disk.state = (1 << MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE);
>
>         info->recovery_start = MaxSector;
>         info->reshape_active = 0;
> @@ -1943,8 +1951,6 @@ static void getinfo_super_ddf(struct supertype *st, struct mdinfo *info, char *m
>                 int i;
>                 for (i = 0 ; i < map_disks; i++) {
>                         if (i < info->array.raid_disks &&
> -                           (be16_to_cpu(ddf->phys->entries[i].state)
> -                            & DDF_Online) &&
>                             !(be16_to_cpu(ddf->phys->entries[i].state)
>                               & DDF_Failed))
>                                 map[i] = 1;
> @@ -2017,7 +2023,11 @@ static void getinfo_super_ddf_bvd(struct supertype *st, struct mdinfo *info, cha
>                 info->disk.raid_disk = cd + conf->sec_elmnt_seq
>                         * be16_to_cpu(conf->prim_elmnt_count);
>                 info->disk.number = dl->pdnum;
> -               info->disk.state = (1<<MD_DISK_SYNC)|(1<<MD_DISK_ACTIVE);
> +               info->disk.state = 0;
> +               if (info->disk.number >= 0 &&
> +                   (be16_to_cpu(ddf->phys->entries[info->disk.number].state) & DDF_Online) &&
> +                   !(be16_to_cpu(ddf->phys->entries[info->disk.number].state) & DDF_Failed))
> +                       info->disk.state = (1<<MD_DISK_SYNC)|(1<<MD_DISK_ACTIVE);
>         }
>
>         info->container_member = ddf->currentconf->vcnum;

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: MDADM 3.3 broken?
  2014-01-20 23:54                                           ` David F.
@ 2014-01-22 22:32                                             ` David F.
  0 siblings, 0 replies; 44+ messages in thread
From: David F. @ 2014-01-22 22:32 UTC (permalink / raw)
  To: NeilBrown; +Cc: Martin Wilck, linux-raid@vger.kernel.org

They said it's now showing up as it should.  Although they are going
to do some more tests later....

Thank You.


On Mon, Jan 20, 2014 at 3:54 PM, David F. <df7729@gmail.com> wrote:
> Ok, thanks - we have sent it on for them to check.
>
> On Sun, Jan 19, 2014 at 8:34 PM, NeilBrown <neilb@suse.de> wrote:
>> On Sat, 14 Dec 2013 13:01:50 -0800 "David F." <df7729@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Just wondering if this gave you guys everything you needed to figure
>>> out the issue?
>>
>> I had everything but time.  I've now made the time and have the fix (I hope).
>>
>> Please try the current HEAD of git://neil.brown.name/mdadm/
>> The important patch is below.
>>
>>>
>>> Also, any idea on when 3.4 may be out with the various fixes?
>>
>> I hope to release 3.3.1 some time in February.  Based on past experience it
>> should be out before Easter, but no promises.
>>
>> NeilBrown
>>
>> From f0e876ce03a63f150bb87b2734c139bc8bb285b2 Mon Sep 17 00:00:00 2001
>> From: NeilBrown <neilb@suse.de>
>> Date: Mon, 20 Jan 2014 15:27:29 +1100
>> Subject: [PATCH] DDF: fix detection of failed devices during assembly.
>>
>> When we call "getinfo_super", we report the working/failed status
>> of the particular device, and also (via the 'map') the working/failed
>> status of every other device that this metadata is aware of.
>>
>> It is important that the way we calculate "working or failed" is
>> consistent.
>> As it is, getinfo_super_ddf() will report a spare as "working", but
>> every other device will see it as "failed", which leads to failure to
>> assemble arrays with spares.
>>
>> For getinfo_super_ddf (i.e. for the container), a device is assumed
>> "working" unless flagged as DDF_Failed.
>> For getinfo_super_ddf_bvd (for a member array), a device is assumed
>> "failed" unless DDF_Online is set, and DDF_Failed is not set.
>>
>> Reported-by: "David F." <df7729@gmail.com>
>> Signed-off-by: NeilBrown <neilb@suse.de>
>>
>> diff --git a/super-ddf.c b/super-ddf.c
>> index d526d8ad3da9..4242af86fea9 100644
>> --- a/super-ddf.c
>> +++ b/super-ddf.c
>> @@ -1913,6 +1913,7 @@ static void getinfo_super_ddf(struct supertype *st, struct mdinfo *info, char *m
>>         info->disk.major = 0;
>>         info->disk.minor = 0;
>>         if (ddf->dlist) {
>> +               struct phys_disk_entry *pde = NULL;
>>                 info->disk.number = be32_to_cpu(ddf->dlist->disk.refnum);
>>                 info->disk.raid_disk = find_phys(ddf, ddf->dlist->disk.refnum);
>>
>> @@ -1920,12 +1921,19 @@ static void getinfo_super_ddf(struct supertype *st, struct mdinfo *info, char *m
>>                                                   entries[info->disk.raid_disk].
>>                                                   config_size);
>>                 info->component_size = ddf->dlist->size - info->data_offset;
>> +               if (info->disk.raid_disk >= 0)
>> +                       pde = ddf->phys->entries + info->disk.raid_disk;
>> +               if (pde &&
>> +                   !(be16_to_cpu(pde->state) & DDF_Failed))
>> +                       info->disk.state = (1 << MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE);
>> +               else
>> +                       info->disk.state = 1 << MD_DISK_FAULTY;
>>         } else {
>>                 info->disk.number = -1;
>>                 info->disk.raid_disk = -1;
>>  //             info->disk.raid_disk = find refnum in the table and use index;
>> +               info->disk.state = (1 << MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE);
>>         }
>> -       info->disk.state = (1 << MD_DISK_SYNC) | (1 << MD_DISK_ACTIVE);
>>
>>         info->recovery_start = MaxSector;
>>         info->reshape_active = 0;
>> @@ -1943,8 +1951,6 @@ static void getinfo_super_ddf(struct supertype *st, struct mdinfo *info, char *m
>>                 int i;
>>                 for (i = 0 ; i < map_disks; i++) {
>>                         if (i < info->array.raid_disks &&
>> -                           (be16_to_cpu(ddf->phys->entries[i].state)
>> -                            & DDF_Online) &&
>>                             !(be16_to_cpu(ddf->phys->entries[i].state)
>>                               & DDF_Failed))
>>                                 map[i] = 1;
>> @@ -2017,7 +2023,11 @@ static void getinfo_super_ddf_bvd(struct supertype *st, struct mdinfo *info, cha
>>                 info->disk.raid_disk = cd + conf->sec_elmnt_seq
>>                         * be16_to_cpu(conf->prim_elmnt_count);
>>                 info->disk.number = dl->pdnum;
>> -               info->disk.state = (1<<MD_DISK_SYNC)|(1<<MD_DISK_ACTIVE);
>> +               info->disk.state = 0;
>> +               if (info->disk.number >= 0 &&
>> +                   (be16_to_cpu(ddf->phys->entries[info->disk.number].state) & DDF_Online) &&
>> +                   !(be16_to_cpu(ddf->phys->entries[info->disk.number].state) & DDF_Failed))
>> +                       info->disk.state = (1<<MD_DISK_SYNC)|(1<<MD_DISK_ACTIVE);
>>         }
>>
>>         info->container_member = ddf->currentconf->vcnum;

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2014-01-22 22:32 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-11-18 18:26 MDADM 3.3 broken? David F.
2013-11-18 20:22 ` Martin Wilck
2013-11-18 23:13   ` David F.
2013-11-19  0:01     ` NeilBrown
2013-11-19 17:05       ` David F.
2013-11-19 20:38         ` Martin Wilck
2013-11-19 22:34           ` David F.
2013-11-19 22:49           ` David F.
2013-11-19 19:45       ` Martin Wilck
2013-11-19 20:08         ` David F.
2013-11-19 23:51         ` NeilBrown
2013-11-20  0:22           ` David F.
2013-11-20  0:35             ` David F.
2013-11-20  0:48               ` NeilBrown
2013-11-20  1:29                 ` David F.
2013-11-20  1:34                   ` David F.
2013-11-20  2:30                     ` NeilBrown
2013-11-20  6:41                       ` David F.
2013-11-20 23:15                         ` David F.
2013-11-21 20:50                           ` Martin Wilck
2013-11-21 21:10                             ` David F.
2013-11-21 21:30                               ` Martin Wilck
2013-11-21 22:39                                 ` David F.
2013-11-25 21:39                                   ` Martin Wilck
2013-11-21 20:46                       ` Martin Wilck
2013-11-21 21:06                         ` David F.
2013-11-21 23:05                         ` David F.
2013-11-21 23:09                           ` David F.
2013-11-22  3:06                             ` David F.
2013-11-22 18:36                               ` David F.
2013-11-23 23:36                                 ` David F.
2013-11-25 21:56                           ` Martin Wilck
2013-11-26  0:24                             ` David F.
2013-11-26 21:59                             ` David F.
2013-11-27 22:40                               ` Martin Wilck
2013-12-06  1:53                                 ` David F.
2013-12-07  2:28                                   ` David F.
2013-12-07  3:16                                     ` NeilBrown
2013-12-07  3:46                                       ` David F.
2013-12-14 21:01                                       ` David F.
2014-01-20  4:34                                         ` NeilBrown
2014-01-20 21:52                                           ` Martin Wilck
2014-01-20 23:54                                           ` David F.
2014-01-22 22:32                                             ` David F.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).