* Raid-10 mount at startup always has problem
@ 2007-08-27 18:14 Daniel L. Miller
From: Daniel L. Miller @ 2007-08-27 18:14 UTC (permalink / raw)
To: linux-raid
Hi!
I have a four-disk RAID-10 array that I created and mount with mdadm.
On almost every reboot, either the array is not recognized at all, or
one of the disks is not added. Adding the missing disk manually with
mdadm works.
Ubuntu, custom-compiled kernel 2.6.22
mdadm 2.6.2
SATA hard drives on an NVIDIA CK804 controller - NOT using NVIDIA RAID.
--
Daniel
* Re: Raid-10 mount at startup always has problem
From: Daniel L. Miller @ 2007-09-10 1:53 UTC (permalink / raw)
To: linux-raid

Bill Davidsen wrote:
> Daniel L. Miller wrote:
>> Hi!
>>
>> I have a four-disk RAID-10 array that I created and mount with
>> mdadm. On almost every reboot, either the array is not recognized
>> at all, or one of the disks is not added. Adding the missing disk
>> manually with mdadm works.
>
> What superblock version and partition type did you use? mdadm -D please.

Thanks for the reply. I've been wondering why no one answered me - then
discovered your answer in my mailbox! Must have been hiding somewhere...

Anyway - mdadm -D /dev/md0:

/dev/md0:
        Version : 00.90.03
  Creation Time : Tue Oct  3 19:11:53 2006
     Raid Level : raid10
     Array Size : 312581632 (298.10 GiB 320.08 GB)
  Used Dev Size : 156290816 (149.05 GiB 160.04 GB)
   Raid Devices : 4
  Total Devices : 4
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Sun Sep  9 18:51:17 2007
          State : clean
 Active Devices : 4
Working Devices : 4
 Failed Devices : 0
  Spare Devices : 0

         Layout : near=2, far=1
     Chunk Size : 32K

           UUID : 9d94b17b:f5fac31a:577c252b:0d4c4b2a
         Events : 0.10811466

    Number   Major   Minor   RaidDevice State
       0       8        0        0      active sync   /dev/sda
       1       8       16        1      active sync   /dev/sdb
       2       8       32        2      active sync   /dev/sdc
       3       8       48        3      active sync   /dev/sdd

And you didn't ask, but my mdadm.conf:

DEVICE partitions
ARRAY /dev/.static/dev/md0 level=raid10 num-devices=4
UUID=9d94b17b:f5fac31a:577c252b:0d4c4b2a

Daniel
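One way to keep a hand-edited ARRAY line from going stale is to let mdadm
regenerate it from the superblocks it actually finds on the running array.
A minimal sketch, assuming the array is assembled; the exact output format
varies with the mdadm version, and the config path is a distribution
assumption (Ubuntu uses /etc/mdadm/mdadm.conf, others /etc/mdadm.conf).
Note also that /dev/.static/dev/md0 is a udev compatibility path; /dev/md0
is the conventional device node:

  # print an ARRAY line derived from the live array's superblocks
  mdadm --detail --scan
  # typical output (UUID matching the -D output above):
  #   ARRAY /dev/md0 level=raid10 num-devices=4 \
  #     UUID=9d94b17b:f5fac31a:577c252b:0d4c4b2a
  # append it, then review the file and remove any duplicate ARRAY lines
  mdadm --detail --scan >> /etc/mdadm/mdadm.conf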
* Re: Raid-10 mount at startup always has problem
From: Richard Scobie @ 2007-09-10 2:04 UTC (permalink / raw)
To: Linux RAID Mailing List

Daniel L. Miller wrote:

> And you didn't ask, but my mdadm.conf:
> DEVICE partitions
> ARRAY /dev/.static/dev/md0 level=raid10 num-devices=4
> UUID=9d94b17b:f5fac31a:577c252b:0d4c4b2a

Hi Daniel,

Try adding

auto=part

at the end of your mdadm.conf ARRAY line.

Regards,

Richard
* Re: Raid-10 mount at startup always has problem
From: Daniel L. Miller @ 2007-09-10 2:11 UTC (permalink / raw)
To: linux-raid

Richard Scobie wrote:
> Daniel L. Miller wrote:
>
>> And you didn't ask, but my mdadm.conf:
>> DEVICE partitions
>> ARRAY /dev/.static/dev/md0 level=raid10 num-devices=4
>> UUID=9d94b17b:f5fac31a:577c252b:0d4c4b2a
>
> Try adding
>
> auto=part
>
> at the end of your mdadm.conf ARRAY line.

Thanks - will see what happens on my next reboot.

Daniel
* Re: Raid-10 mount at startup always has problem
From: Daniel L. Miller @ 2007-10-24 14:22 UTC (permalink / raw)
To: linux-raid

Daniel L. Miller wrote:
> Richard Scobie wrote:
>> Try adding
>>
>> auto=part
>>
>> at the end of your mdadm.conf ARRAY line.
> Thanks - will see what happens on my next reboot.

Current mdadm.conf:

DEVICE partitions
ARRAY /dev/.static/dev/md0 level=raid10 num-devices=4
UUID=9d94b17b:f5fac31a:577c252b:0d4c4b2a auto=part

I still have the problem where, on boot, one drive is not part of the
array. Is there a log file I can check to find out WHY a drive is not
being added? It's been a while since the reboot, but I did find some
entries in dmesg - I'm appending both the md lines and the physical
disk related lines. The bottom shows one disk not being added (this
time it was sda) - and the disk that gets skipped on each boot seems to
be random; there's no consistent failure:

[...]
md: raid10 personality registered for level 10
[...]
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
[...]
scsi0 : sata_nv
scsi1 : sata_nv
ata1: SATA max UDMA/133 cmd 0xffffc20001428480 ctl 0xffffc200014284a0 bmdma 0x0000000000011410 irq 23
ata2: SATA max UDMA/133 cmd 0xffffc20001428580 ctl 0xffffc200014285a0 bmdma 0x0000000000011418 irq 23
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ATA-7: ST3160811AS, 3.AAE, max UDMA/133
ata1.00: 312581808 sectors, multi 16: LBA48 NCQ (depth 31/32)
ata1.00: configured for UDMA/133
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ATA-7: ST3160811AS, 3.AAE, max UDMA/133
ata2.00: 312581808 sectors, multi 16: LBA48 NCQ (depth 31/32)
ata2.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access     ATA      ST3160811AS      3.AA PQ: 0 ANSI: 5
ata1: bounce limit 0xFFFFFFFFFFFFFFFF, segment boundary 0xFFFFFFFF, hw segs 61
scsi 1:0:0:0: Direct-Access     ATA      ST3160811AS      3.AA PQ: 0 ANSI: 5
ata2: bounce limit 0xFFFFFFFFFFFFFFFF, segment boundary 0xFFFFFFFF, hw segs 61
ACPI: PCI Interrupt Link [LSI1] enabled at IRQ 22
ACPI: PCI Interrupt 0000:00:08.0[A] -> Link [LSI1] -> GSI 22 (level, high) -> IRQ 22
sata_nv 0000:00:08.0: Using ADMA mode
PCI: Setting latency timer of device 0000:00:08.0 to 64
scsi2 : sata_nv
scsi3 : sata_nv
ata3: SATA max UDMA/133 cmd 0xffffc2000142a480 ctl 0xffffc2000142a4a0 bmdma 0x0000000000011420 irq 22
ata4: SATA max UDMA/133 cmd 0xffffc2000142a580 ctl 0xffffc2000142a5a0 bmdma 0x0000000000011428 irq 22
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata3.00: ATA-7: ST3160811AS, 3.AAE, max UDMA/133
ata3.00: 312581808 sectors, multi 16: LBA48 NCQ (depth 31/32)
ata3.00: configured for UDMA/133
ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata4.00: ATA-7: ST3160811AS, 3.AAE, max UDMA/133
ata4.00: 312581808 sectors, multi 16: LBA48 NCQ (depth 31/32)
ata4.00: configured for UDMA/133
scsi 2:0:0:0: Direct-Access     ATA      ST3160811AS      3.AA PQ: 0 ANSI: 5
ata3: bounce limit 0xFFFFFFFFFFFFFFFF, segment boundary 0xFFFFFFFF, hw segs 61
scsi 3:0:0:0: Direct-Access     ATA      ST3160811AS      3.AA PQ: 0 ANSI: 5
ata4: bounce limit 0xFFFFFFFFFFFFFFFF, segment boundary 0xFFFFFFFF, hw segs 61
sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sda: unknown partition table
sd 0:0:0:0: [sda] Attached SCSI disk
sd 1:0:0:0: [sdb] 312581808 512-byte hardware sectors (160042 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 1:0:0:0: [sdb] 312581808 512-byte hardware sectors (160042 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdb: unknown partition table
sd 1:0:0:0: [sdb] Attached SCSI disk
sd 2:0:0:0: [sdc] 312581808 512-byte hardware sectors (160042 MB)
sd 2:0:0:0: [sdc] Write Protect is off
sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 2:0:0:0: [sdc] 312581808 512-byte hardware sectors (160042 MB)
sd 2:0:0:0: [sdc] Write Protect is off
sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdc: unknown partition table
sd 2:0:0:0: [sdc] Attached SCSI disk
sd 3:0:0:0: [sdd] 312581808 512-byte hardware sectors (160042 MB)
sd 3:0:0:0: [sdd] Write Protect is off
sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 3:0:0:0: [sdd] 312581808 512-byte hardware sectors (160042 MB)
sd 3:0:0:0: [sdd] Write Protect is off
sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdd: unknown partition table
sd 3:0:0:0: [sdd] Attached SCSI disk
[...]
md: md0 stopped.
md: md0 stopped.
md: bind<sdc>
md: bind<sdd>
md: bind<sdb>
md: md0: raid array is not clean -- starting background reconstruction
raid10: raid set md0 active with 3 out of 4 devices
md: couldn't update array info. -22
md: resync of RAID array md0
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
md: using 128k window, over a total of 312581632 blocks.
Filesystem "md0": Disabling barriers, not supported by the underlying device
XFS mounting filesystem md0
Starting XFS recovery on filesystem: md0 (logdev: internal)
Ending XFS recovery on filesystem: md0 (logdev: internal)

--
Daniel
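When a member gets dropped like this, the per-device superblock event
counters usually tell the story. A minimal sketch of the inspection,
assuming the device names from the array above:

  cat /proc/mdstat      # shows which members the kernel actually bound
  mdadm --examine /dev/sda /dev/sdb /dev/sdc /dev/sdd | egrep 'Event|State'
  # the member that was left out typically shows an older event count
  # than the other three; dmesg around the "md: bind<...>" lines shows
  # the assembly order

On Ubuntu the initramfs carries its own copy of mdadm.conf, so if the
array is assembled before the root filesystem switch, a stale copy there
can undo edits to /etc/mdadm/mdadm.conf. Refreshing it (an assumption
about this poster's setup, but standard initramfs-tools practice):

  update-initramfs -u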
* Re: Raid-10 mount at startup always has problem
From: Doug Ledford @ 2007-10-24 16:25 UTC (permalink / raw)
To: Daniel L. Miller; +Cc: linux-raid

On Wed, 2007-10-24 at 07:22 -0700, Daniel L. Miller wrote:
> Current mdadm.conf:
> DEVICE partitions
> ARRAY /dev/.static/dev/md0 level=raid10 num-devices=4
> UUID=9d94b17b:f5fac31a:577c252b:0d4c4b2a auto=part
>
> I still have the problem where, on boot, one drive is not part of the
> array. Is there a log file I can check to find out WHY a drive is not
> being added?

It usually means either the device is busy at the time the raid startup
happened, or the device hadn't yet been created by udev at the time the
startup happened. Is it failing to start the array properly in the
initrd, or is this happening after you've switched to the rootfs and are
running the startup scripts?

> md: md0 stopped.
> md: md0 stopped.
> md: bind<sdc>
> md: bind<sdd>
> md: bind<sdb>

Whole disk raid devices == bad. Lots of stuff can go wrong with that
setup.

> md: md0: raid array is not clean -- starting background reconstruction
> raid10: raid set md0 active with 3 out of 4 devices
> [...]

--
Doug Ledford <dledford@redhat.com>  GPG KeyID: CFBFF194
http://people.redhat.com/dledford
Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband
* Re: Raid-10 mount at startup always has problem
From: Bill Davidsen @ 2007-10-24 20:01 UTC (permalink / raw)
To: Daniel L. Miller; +Cc: linux-raid

Daniel L. Miller wrote:
> Current mdadm.conf:
> DEVICE partitions
> ARRAY /dev/.static/dev/md0 level=raid10 num-devices=4
> UUID=9d94b17b:f5fac31a:577c252b:0d4c4b2a auto=part
>
> I still have the problem where, on boot, one drive is not part of the
> array. Is there a log file I can check to find out WHY a drive is not
> being added? [...] The bottom shows one disk not being added (this
> time it was sda) - and the disk that gets skipped on each boot seems
> to be random; there's no consistent failure:

I suspect the base problem is that you are using whole disks instead of
partitions, and the partition-table complaints below are probably an
indication that you have something on those drives which looks like a
partition table but isn't. That prevents the drive from being recognized
as a whole drive. You're lucky: if the data had looked enough like a
partition table to be valid, the o/s probably would have tried to do
something with it.

I can't see any easy (or safe) backout on this. You have used the whole
disk, so you can't just drop a drive, partition it, and add the partition
back in place of the drive. And if you have a failure and ever have to
replace a drive, you will have to use a drive or partition at least as
large as what you have now.

Hopefully someone will have a good idea how to gracefully transition to a
safer setup. If random data ever looks like a valid partition table, evil
may occur; and if you ever get this on two drives at once, the system
won't boot. Two time-bomb cases, and they're not mutually exclusive.

This may be the rare case where you really do need to specify the actual
devices to get reliable operation.

> [...]
> sda: unknown partition table
> [...]
> sdb: unknown partition table
> [...]
> sdc: unknown partition table
> [...]
> sdd: unknown partition table
> [...]
> md: bind<sdc>
> md: bind<sdd>
> md: bind<sdb>
> md: md0: raid array is not clean -- starting background reconstruction
> raid10: raid set md0 active with 3 out of 4 devices
> [...]

--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
* Re: Raid-10 mount at startup always has problem
From: Daniel L. Miller @ 2007-10-25 5:43 UTC (permalink / raw)
To: linux-raid

Bill Davidsen wrote:
> I suspect the base problem is that you are using whole disks instead
> of partitions [...]
> This may be the rare case where you really do need to specify the
> actual devices to get reliable operation.

OK - I'm officially confused now (I was just unofficially confused
before). WHY is it a problem to use whole drives as RAID components? I
would have thought that building a RAID storage unit with identically
sized drives - and using each drive's full capacity - is exactly how
you're supposed to do it!

I should mention that the boot/system drive is IDE, and NOT part of the
RAID. So I'm not worried about losing the system - but I AM concerned
about the data. I'm using four drives in a RAID-10 configuration, which
I thought would provide a good blend of safety and performance for a
small fileserver.

Because it's RAID-10, I would ASSuME that I can drop one drive (after
all, I keep booting one drive short), partition it if necessary, and add
it back in. But how would splitting these disks into partitions improve
either stability or performance?

--
Daniel
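Daniel's drop-one-drive idea is in fact the usual migration path for a
redundant array. A hedged sketch of converting one member at a time; the
device names and sfdisk invocation are assumptions, the array runs
degraded while each disk is converted, a current backup is prudent, and
this only works if the new partition is still at least "Used Dev Size"
(from mdadm -D) plus superblock space - worth verifying before starting:

  # 1. pull one member out of the still-redundant array
  mdadm /dev/md0 --fail /dev/sda --remove /dev/sda
  # 2. wipe the old whole-disk superblock so it can't be found later
  mdadm --zero-superblock /dev/sda
  # 3. create one full-size partition of type fd (Linux raid autodetect)
  echo ',,fd' | sfdisk /dev/sda
  # 4. add the partition back and let the array rebuild onto it
  mdadm /dev/md0 --add /dev/sda1
  # 5. wait for the resync to finish before touching the next disk
  cat /proc/mdstat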
* Re: Raid-10 mount at startup always has problem
From: Doug Ledford @ 2007-10-25 6:40 UTC (permalink / raw)
To: Daniel L. Miller; +Cc: linux-raid

On Wed, 2007-10-24 at 22:43 -0700, Daniel L. Miller wrote:
> OK - I'm officially confused now (I was just unofficially confused
> before). WHY is it a problem to use whole drives as RAID components?
> I would have thought that building a RAID storage unit with
> identically sized drives - and using each drive's full capacity - is
> exactly how you're supposed to do it!

As much as anything else, this can be summed up as: you are thinking of
how *you* are using the drives, not how unexpected software on your
system might try to use them. Without a partition table, none of the
software on your system can know what to do with the drives except
mdadm, when it finds an md superblock. That doesn't stop other software
from *trying* to find out how to use your drives, though. That includes
the kernel looking for a valid partition table, mount possibly scanning
the drive for a filesystem label, lvm scanning for an lvm superblock,
mtools looking for a dos filesystem, etc. Under normal conditions, the
random data on your drive will never look valid to these other pieces
of software. But once in a great while it will look valid, and that's
when all hell breaks loose.

Or worse, you run a partition program such as fdisk on the device and
it initializes the partition table (something the Fedora/RHEL
installers do to all disks without partition tables... well, the
installer tells you there's no partition table and asks if you want to
initialize it, but if someone is in a hurry and hits yes when they
meant no, bye bye data).

The partition table is the single, (mostly) universally recognized
arbiter of what possible data might be on the disk. Having a partition
table may not make mdadm recognize the md superblock any better, but it
keeps all that other stuff from even trying to access data it has no
need to access, and prevents random luck from turning your day bad. Oh,
and let's not go into what can happen on a dual boot machine, and what
Windows might do to the disk if it doesn't think the disk space is
already spoken for by a linux partition.

And, in particular with mdadm: I once created a full disk md raid array
on a couple of disks, then couldn't get things arranged like I wanted,
so I just partitioned the disks and created new arrays in the
partitions (without first manually zeroing the superblock for the whole
disk array). Since I had used a version 1.0 superblock on the whole
disk array, and then used version 1.1 superblocks in the partitions,
the net result was that when I ran mdadm -Eb, mdadm would find both the
1.1 and the 1.0 superblocks in the last partition on the disk. Confused
both myself and mdadm for a while.

Anyway, I happen to *like* the idea of using full disk devices, but the
reality is that the md subsystem doesn't have exclusive ownership of
the disks at all times, and without that it really needs to stake a
claim on the space instead of leaving things to chance, IMO.

--
Doug Ledford <dledford@redhat.com>  GPG KeyID: CFBFF194
http://people.redhat.com/dledford
Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband
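The stale-superblock confusion Doug describes is avoidable by wiping the
old metadata before reusing a device. A minimal sketch, with the device
names assumed; the array must not be running when the superblock is
zeroed (a 1.0 superblock lives at the end of the device, which is why it
also shows up near the end of the disk's last partition):

  mdadm --stop /dev/md0               # the array must be stopped first
  mdadm --zero-superblock /dev/sda    # erase the whole-disk superblock
  # repeat for each member, then partition the disks and create the
  # new arrays inside the partitions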
* Re: Raid-10 mount at startup always has problem
From: Luca Berra @ 2007-10-26 9:15 UTC (permalink / raw)
To: linux-raid

On Thu, Oct 25, 2007 at 02:40:06AM -0400, Doug Ledford wrote:
>partition table (something the Fedora/RHEL installers do to all disks
>without partition tables...well, the installer tells you there's no
>partition table and asks if you want to initialize it, but if someone
>is in a hurry and hits yes when they meant no, bye bye data).

Cool feature!!!!

>The partition table is the single, (mostly) universally recognized
>arbiter of what possible data might be on the disk. [...]

On a PC, maybe - but that is a 20-year-old design. The partition table
design is limited because it is still based on C/H/S addressing, which
does not exist anymore. Put a partition table on a big storage array,
say a DMX, and enjoy a 20% performance decrease.

>Oh, and let's not go into what can happen on a dual boot machine, and
>what Windows might do to the disk if it doesn't think the disk space
>is already spoken for by a linux partition.

Why the hell should the existence of Windows limit the ability of Linux
to work properly? If I have a PC that dual-boots Windows, I will take
care to use the common denominator of a partition table; if it is my
big server, I probably will not, since it won't boot anything other
than Linux.

>And, in particular with mdadm: I once created a full disk md raid
>array on a couple of disks [...] Confused both myself and mdadm for a
>while.

Yes, this is fun. On the opposite side: I once inserted an mmc memory
card, which had been initialized in my mobile phone, into the mmc slot
of my laptop, and was faced with a load of errors about mmcblk0 having
an invalid partition table. Obviously it had none - it was a plain fat
filesystem. Is the solution to partition it? I don't think the phone
would agree.

>Anyway, I happen to *like* the idea of using full disk devices, but
>the reality is that the md subsystem doesn't have exclusive ownership
>of the disks at all times, and without that it really needs to stake a
>claim on the space instead of leaving things to chance, IMO.

Start by removing the partition detection code from the blasted kernel
and moving it to userspace - which is already in place, but is not the
default.

--
Luca Berra -- bluca@comedia.it
Communication Media & Services S.r.l.
* Re: Raid-10 mount at startup always has problem
From: Gabor Gombas @ 2007-10-26 16:53 UTC (permalink / raw)
To: linux-raid

On Fri, Oct 26, 2007 at 11:15:13AM +0200, Luca Berra wrote:

> On a PC, maybe - but that is a 20-year-old design. The partition
> table design is limited because it is still based on C/H/S
> addressing, which does not exist anymore.

The MS-DOS format is not the only possible partition table layout.
Other formats such as GPT do not have such limitations.

> Put a partition table on a big storage array, say a DMX, and enjoy a
> 20% performance decrease.

I assume your "big storage" uses some kind of RAID. Are your partitions
stripe-aligned? (Btw. that has nothing to do with partitions; LVM can
also suffer if PEs are not aligned.)

>> Oh, and let's not go into what can happen on a dual boot machine,
>> and what Windows might do to the disk if it doesn't think the disk
>> space is already spoken for by a linux partition.
> Why the hell should the existence of Windows limit the ability of
> Linux to work properly?

Well, if you convert a Windows partition to Linux by just changing the
partition type, running mke2fs over it, and filling it with data,
Windows will happily ignore the partition table change and overwrite
your data without any notice on the next boot (this happened to one
colleague; not fun to debug). So much for automatic device type
detection...

> On the opposite side: I once inserted an mmc memory card [...] Is the
> solution to partition it? I don't think the phone would agree.

Well, it said it could not find a valid partition table. That was the
truth. Why is it a problem if the kernel states a fact?

Gabor

--
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
* Re: Raid-10 mount at startup always has problem
From: Luca Berra @ 2007-10-27 7:57 UTC (permalink / raw)
To: linux-raid

On Fri, Oct 26, 2007 at 06:53:40PM +0200, Gabor Gombas wrote:
>I assume your "big storage" uses some kind of RAID. Are your
>partitions stripe-aligned? (Btw. that has nothing to do with
>partitions; LVM can also suffer if PEs are not aligned.)

Mine are. Unfortunately, the default is to start them 32256 bytes into
the device.

>Well, if you convert a Windows partition to Linux by just changing the
>partition type [...]

What I am saying is that a dual boot machine is not the only scenario
we have.

>Well, it said it could not find a valid partition table. That was the
>truth. Why is it a problem if the kernel states a fact?

It is random. Reformatting the card made the kernel message go away. I
wonder if, by chance, something would ever decide it is a valid
partition table...

--
Luca Berra -- bluca@comedia.it
Communication Media & Services S.r.l.
* Re: Raid-10 mount at startup always has problem
From: Doug Ledford @ 2007-10-26 19:26 UTC (permalink / raw)
To: Luca Berra; +Cc: linux-raid

On Fri, 2007-10-26 at 11:15 +0200, Luca Berra wrote:
> On a PC, maybe - but that is a 20-year-old design.

So? Unix is a 35+ year old design; I suppose you want to switch to
Vista then?

> The partition table design is limited because it is still based on
> C/H/S addressing, which does not exist anymore. Put a partition table
> on a big storage array, say a DMX, and enjoy a 20% performance
> decrease.

Because you didn't stripe-align the partition - your bad.

> Why the hell should the existence of Windows limit the ability of
> Linux to work properly?

Linux works properly with a partition table, so this is a specious
statement.

> If I have a PC that dual-boots Windows, I will take care to use the
> common denominator of a partition table; if it is my big server, I
> probably will not, since it won't boot anything other than Linux.

Doesn't really gain you anything, but it's your choice. Besides, the
question wasn't "why shouldn't Luca Berra use whole disk devices", it
was why I don't recommend using whole disk devices, and my
recommendation wasn't based in the least bit on a single person's use
scenario.

> On the opposite side: I once inserted an mmc memory card, which had
> been initialized in my mobile phone, into the mmc slot of my laptop,
> and was faced with a load of errors about mmcblk0 having an invalid
> partition table.

So? The messages are just informative; feel free to ignore them.

> Obviously it had none - it was a plain fat filesystem. Is the
> solution to partition it? I don't think the phone would agree.

The phone dictates the format; only a moron would say otherwise. But
then again, the phone doesn't care about interoperability and the many
other issues on memory cards that it thinks it owns, so only a moron
would argue that because a phone doesn't use a partition table, nothing
else in the computer realm needs to either.

> Start by removing the partition detection code from the blasted
> kernel and moving it to userspace - which is already in place, but is
> not the default.

Which just moves where the work is done, not what work needs to be
done. It's a change for no benefit and a waste of time.

--
Doug Ledford <dledford@redhat.com>  GPG KeyID: CFBFF194
http://people.redhat.com/dledford
Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband
* Re: Raid-10 mount at startup always has problem
From: Luca Berra @ 2007-10-27 7:50 UTC (permalink / raw)
To: linux-raid

On Fri, Oct 26, 2007 at 03:26:33PM -0400, Doug Ledford wrote:
>So? Unix is a 35+ year old design; I suppose you want to switch to
>Vista then?

Unix is a 35+ year old design that evolved over time; some ideas were
kept, some ditched.

>Because you didn't stripe-align the partition - your bad.

:) By default, fdisk misaligns partition tables, and aligning them is
more complex than simply doing without.

>Linux works properly with a partition table, so this is a specious
>statement.

It should also work properly without one.

>Doesn't really gain you anything, but it's your choice. Besides, the
>question wasn't "why shouldn't Luca Berra use whole disk devices" [...]

If I am the only person in the world who believes partition tables
should not be required, then I'll shut up.

>So? The messages are just informative; feel free to ignore them.

But didn't anaconda propose to wipe unpartitioned disks?

>The phone dictates the format; only a moron would say otherwise. [...]

I don't count myself as a moron. What I am trying to say is that
partition tables are one way of organizing disk space, not the only
one.

>Which just moves where the work is done, not what work needs to be
>done.

It also makes it possible to decide whether the work has to be done at
all.

>It's a change for no benefit and a waste of time.

The waste of time was having to put code in mdadm to undo partition
detection on component devices, where partition detection should never
have taken place.

--
Luca Berra -- bluca@comedia.it
Communication Media & Services S.r.l.
* Re: Raid-10 mount at startup always has problem
From: Gabor Gombas @ 2007-10-27 15:07 UTC (permalink / raw)
To: linux-raid

On Sat, Oct 27, 2007 at 09:50:55AM +0200, Luca Berra wrote:
>> Because you didn't stripe-align the partition - your bad.
> :) By default, fdisk misaligns partition tables, and aligning them is
> more complex than simply doing without.

Why use fdisk then? Use parted instead. It's not the kernel's fault if
you use tools not suited for a given task...

>> Linux works properly with a partition table, so this is a specious
>> statement.
> It should also work properly without one.

It does:

sd 0:0:2:0: [sdc] Very big device. Trying to use READ CAPACITY(16).
sd 0:0:2:0: [sdc] 7812333568 512-byte hardware sectors (3999915 MB)
sd 0:0:2:0: [sdc] Write Protect is off
sd 0:0:2:0: [sdc] Mode Sense: 23 00 00 00
sd 0:0:2:0: [sdc] Write cache: enabled, read cache: disabled, doesn't support DPO or FUA
sdc: unknown partition table

Works perfectly without any partition table... You seem to be annoyed
that the kernel tells you there is no partition table it recognizes -
but if that bothers you so much, simply stop reading the kernel logs.
My kernel also tells me that it failed to find an AGP bridge - by your
logic, should everyone still using AGP-capable motherboards toss their
systems on the junkyard?!?

Gabor

--
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
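On the alignment point, a hedged sketch of creating a stripe-aligned
partition with parted; the device name and the 1 MiB start offset are
assumptions (any offset that is a multiple of the array's chunk size -
32K for the array in this thread - would do), and the unit syntax is
that of reasonably recent parted versions:

  parted /dev/sdc mklabel gpt
  # start at sector 2048 (1 MiB), a multiple of any common chunk size
  parted /dev/sdc mkpart primary 2048s 100%
  parted /dev/sdc set 1 raid on
  parted /dev/sdc unit s print    # verify the start sector is aligned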
* Re: Raid-10 mount at startup always has problem
From: Doug Ledford @ 2007-10-27 20:47 UTC (permalink / raw)
To: Luca Berra; +Cc: linux-raid

On Sat, 2007-10-27 at 09:50 +0200, Luca Berra wrote:
> Unix is a 35+ year old design that evolved over time; some ideas were
> kept, some ditched.

BSD disk labels are still in use, SunOS disk labels are still in use,
and partition tables are somewhat on the way out - but only because
they are being replaced by the new EFI disk partitioning method. The
only place where partitionless devices are common is in dedicated raid
boxes where the raid controller is the only thing that will *ever* see
the disk. Sometimes they do it on big SAN/NAS gear because they don't
want to align the partition table to the underlying device's stripe
layout, but even then they do so in a tightly controlled environment
where they know exactly which machines will be allowed to even try to
access the device.

> :) By default, fdisk misaligns partition tables, and aligning them is
> more complex than simply doing without.

So? You really need to take the time to understand the alignment of the
device anyway, because then and only then can you pass options to
mke2fs to align the fs metadata with the stripes as well, buying you
even more performance than just leaving off the partition table
(assuming that's what you use; I don't know if other mkfs programs have
the same options for aligning metadata with stripes). And if you take
the time to understand the underlying stripe layout for the mkfs stuff,
you can use the same information to align the partition table.

> It should also work properly without one.

Most of the time it does. But in those cases where it can fail, the
failure is due to not taking the precautions necessary to prevent it:
aka labeling disk usage via some sort of partition
table/disklabel/etc.

> But didn't anaconda propose to wipe unpartitioned disks?

Did you stick your mmc card in there during the install of the OS?
That's the only time anaconda ever runs, and therefore the only time it
ever checks your devices. It makes sense that during the initial
install, when the OS is only configured to see locally connected
devices, or possibly iSCSI devices you have specifically told it to
probe, it would ask you that question about those devices. Other
network attached or shared devices are generally added after the
initial install.

> I don't count myself as a moron. What I am trying to say is that
> partition tables are one way of organizing disk space, not the only
> one.

Using whole disk devices isn't a means of organizing space. It's a way
to get back a rather minuscule amount of space by *not* organizing the
space.

This whole argument seems to boil down to you wanting to perfectly
optimize your system for your use case, which includes controlling the
environment enough to know it's safe not to partition your disks;
whereas I argue that although this works in controlled environments, it
has known failure modes in other environments, and I would be totally
remiss if I recommended that my customers take a risk you can ignore
because of your controlled environment, since I know a lot of my
customers *don't* have a controlled environment such as yours.

--
Doug Ledford <dledford@redhat.com>  GPG KeyID: CFBFF194
http://people.redhat.com/dledford
Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband
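Doug's mke2fs point, as a hedged sketch. The numbers are derived from the
array in this thread (32K chunk, 4-disk raid10 with near=2, so two
data-bearing disks per stripe); the 4K block size is an assumption, and
the stripe-width option needs a reasonably recent e2fsprogs. Note the
poster's array actually carries XFS, where the analogous knobs are
mkfs.xfs's su/sw:

  # stride       = chunk size / fs block size = 32K / 4K = 8
  # stripe-width = stride * data-bearing disks = 8 * 2 = 16
  mke2fs -b 4096 -E stride=8,stripe-width=16 /dev/md0
  # assumed XFS equivalent: mkfs.xfs -d su=32k,sw=2 /dev/md0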
* Re: Raid-10 mount at startup always has problem 2007-10-27 20:47 ` Doug Ledford @ 2007-10-28 13:37 ` Luca Berra 2007-10-28 17:55 ` Doug Ledford 0 siblings, 1 reply; 42+ messages in thread From: Luca Berra @ 2007-10-28 13:37 UTC (permalink / raw) To: linux-raid On Sat, Oct 27, 2007 at 04:47:30PM -0400, Doug Ledford wrote: >On Sat, 2007-10-27 at 09:50 +0200, Luca Berra wrote: >> On Fri, Oct 26, 2007 at 03:26:33PM -0400, Doug Ledford wrote: >> >On Fri, 2007-10-26 at 11:15 +0200, Luca Berra wrote: >> >> On Thu, Oct 25, 2007 at 02:40:06AM -0400, Doug Ledford wrote: >> >> >The partition table is the single, (mostly) universally recognized >> >> >arbiter of what possible data might be on the disk. Having a partition >> >> >table may not make mdadm recognize the md superblock any better, but it >> >> >keeps all that other stuff from even trying to access data that it >> >> >doesn't have a need to access and prevents random luck from turning your >> >> >day bad. >> >> on a pc maybe, but that is 20 years old design. >> > >> >So? Unix is 35+ year old design, I suppose you want to switch to Vista >> >then? >> unix is a 35+ year old design that evolved in time, some ideas were >> kept, some ditched. > >BSD disk labels are still in use, SunOS disk labels are still in use, i am not a solaris expert, do they still use disk labels under vxvm? oh, by the way, disklabels do not support the partition type attribute. >partition tables are somewhat on the way out, but only because they are >being replaced by the new EFI disk partitioning method. The only place >where partitionless devices is common is in dedicated raid boxes where >the raid controller is the only thing that will *ever* see that disk. well i am more used to other os (HP, AIX) where lvm is the common mean of accessing disk devices .... >> by default fdisk misalignes partition tables >> and aligning them is more complex than just doing without. > >So. You really need to take the time and to understand the alignment of >the device because then and only then can you pass options to mke2fs to yes and i am not the only person in the world doing that. >> >Linux works properly with a partition table, so this is a specious >> >statement. >> It should also work properly without one. > >Most of the time it does. But those times where it can fail, the >failure is due to not taking the precautions necessary to prevent it: >aka labeling disk usage via some sort of partition table/disklabel/etc. I strongly disagree. the failure is badly designed software. >Did you stick your mmc card in there during the install of the OS? My laptop has a built-in mmc slot, so i sometimes leave a card plugged in. But the mmc thing was just an example, it is not that critical. >> i don't count myself as a moron, what i am trying to say is that >> partition tables are one way of organizing disk space, not the only one. > >Using whole disk devices isn't a means of organizing space. It's a way >to get a rather miniscule amount of space back by *not* organizing the >space. if i am using, say lvm to organize disk space, a partition table is unnecessary to the organization, and it is natural not using them. 
>This whole argument seems to boil down to you wanting to perfectly
>optimize your system for your use case, which includes controlling the
>environment enough that you know it's safe to not partition your disks,
>whereas I argue that although this works in controlled environments, it
>is known to have failure modes in other environments, and I would be
>totally remiss if I recommended to my customers that they should take
>the risk that you can ignore because of your controlled environment
>since I know a lot of my customers *don't* have a controlled environment
>such as you do.

The whole argument to me boils down to the fact that not having a partition
table on a device is possible, and software that does not consider this
eventuality is flawed, and recommending to work around flawed software is
just burying your head in the sand.

But i believe i did not convince you one ounce more than you convinced
me, so i'll quit this thread, which has gone on long enough.

Regards,
L.

--
Luca Berra -- bluca@comedia.it
Communication Media & Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
 X  AGAINST HTML MAIL
/ \

^ permalink raw reply	[flat|nested] 42+ messages in thread
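Luca's point that software can "consider this eventuality" is testable: a
version 0.90 superblock sits 64 KiB before the end of the device, rounded
down to a 64 KiB boundary, so a probe only needs to read four bytes there.
A sketch, assuming a little-endian machine and the 0.90 metadata used
elsewhere in this thread:

  SIZE=$(blockdev --getsize64 /dev/sda)
  # 0.90 superblock offset: size rounded down to 64 KiB, minus 64 KiB
  OFFSET=$(( (SIZE & ~65535) - 65536 ))
  dd if=/dev/sda bs=1 skip=$OFFSET count=4 2>/dev/null | od -An -tx1
  # fc 4e 2b a9 is the md magic 0xa92b4efc stored little-endian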
* Re: Raid-10 mount at startup always has problem
  2007-10-28 13:37 ` Luca Berra
@ 2007-10-28 17:55 ` Doug Ledford
  0 siblings, 0 replies; 42+ messages in thread
From: Doug Ledford @ 2007-10-28 17:55 UTC (permalink / raw)
To: Luca Berra; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 3249 bytes --]

On Sun, 2007-10-28 at 14:37 +0100, Luca Berra wrote:
> On Sat, Oct 27, 2007 at 04:47:30PM -0400, Doug Ledford wrote:
> >Most of the time it does. But those times when it can fail, the
> >failure is due to not taking the precautions necessary to prevent it:
> >aka labeling disk usage via some sort of partition table/disklabel/etc.
> I strongly disagree.
> The failure is badly designed software.

Then you need to blame Ingo, who made putting the superblock at the end
of the device the standard. If the superblock were always at the
beginning, then this whole argument would be moot. Things would be
reliable the way you want.

> >Using whole disk devices isn't a means of organizing space. It's a way
> >to get a rather minuscule amount of space back by *not* organizing the
> >space.
> if i am using, say, lvm to organize disk space, a partition table is
> unnecessary to the organization, and it is natural not to use one.

If you are using straight lvm then you don't have this problem anyway.
Lvm doesn't allow the underlying physical device to *look* like a valid,
partitioned, single device. Md does when the superblock is at the end.

> >This whole argument seems to boil down to you wanting to perfectly
> >optimize your system for your use case, which includes controlling the
> >environment enough that you know it's safe to not partition your disks,
> >whereas I argue that although this works in controlled environments, it
> >is known to have failure modes in other environments, and I would be
> >totally remiss if I recommended to my customers that they should take
> >the risk that you can ignore because of your controlled environment
> >since I know a lot of my customers *don't* have a controlled environment
> >such as you do.
>
> The whole argument to me boils down to the fact that not having a partition
> table on a device is possible, and software that does not consider this
> eventuality is flawed,

It's simply not possible to differentiate with 100% certainty between a
whole-disk, partitioned md device with the superblock at the end and a
regular device. Period. You can try to be clever, but you can also get
tripped up. The flaw is not with the software, it's with a design that
allowed this to happen.

> and recommending to work around flawed software is
> just burying your head in the sand.

If a design is broken but in place, I have no choice but to work around
it. Anything else is just stupid.

> But i believe i did not convince you one ounce more than you convinced
> me, so i'll quit this thread, which has gone on long enough.
>
> Regards,
> L.
>
> --
> Luca Berra -- bluca@comedia.it
> Communication Media & Services S.r.l.
> /"\ > \ / ASCII RIBBON CAMPAIGN > X AGAINST HTML MAIL > / \ > - > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Doug Ledford <dledford@redhat.com> GPG KeyID: CFBFF194 http://people.redhat.com/dledford Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
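The "superblock at the beginning" design Doug refers to is what the
version 1.1 and 1.2 metadata formats provide (1.1 at the very start of
the member, 1.2 at 4 KiB in; 0.90 and 1.0 sit at the end). A hedged
sketch of creating such an array with the mdadm of this era; the device
names and chunk size are illustrative, and the command destroys any
existing data on the members:

  mdadm --create /dev/md0 --metadata=1.1 --level=raid10 \
        --raid-devices=4 --chunk=32 /dev/sd[abcd]1

As Doug notes later in the thread, /boot is the exception: bootloaders of
the time could not read a filesystem that starts after a leading
superblock.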
* Re: Raid-10 mount at startup always has problem 2007-10-26 19:26 ` Doug Ledford 2007-10-27 7:50 ` Luca Berra @ 2007-10-29 0:21 ` Bill Davidsen 2007-10-29 7:41 ` Luca Berra 2007-10-29 14:31 ` Doug Ledford 1 sibling, 2 replies; 42+ messages in thread From: Bill Davidsen @ 2007-10-29 0:21 UTC (permalink / raw) To: Doug Ledford; +Cc: Luca Berra, linux-raid Doug Ledford wrote: > On Fri, 2007-10-26 at 11:15 +0200, Luca Berra wrote: > >> On Thu, Oct 25, 2007 at 02:40:06AM -0400, Doug Ledford wrote: >> >>> The partition table is the single, (mostly) universally recognized >>> arbiter of what possible data might be on the disk. Having a partition >>> table may not make mdadm recognize the md superblock any better, but it >>> keeps all that other stuff from even trying to access data that it >>> doesn't have a need to access and prevents random luck from turning your >>> day bad. >>> >> on a pc maybe, but that is 20 years old design. >> > > So? Unix is 35+ year old design, I suppose you want to switch to Vista > then? > > >> partition table design is limited because it is still based on C/H/S, >> which do not exist anymore. >> Put a partition table on a big storage, say a DMX, and enjoy a 20% >> performance decrease. >> > > Because you didn't stripe align the partition, your bad. > Align to /what/ stripe? Hardware (CHS is fiction), software (of the RAID you're about to create), or ??? I don't notice my FC6 or FC7 install programs using any special partition location to start, I have only run (tried to run) FC8-test3 for the live CD, so I can't say what it might do. CentOS4 didn't do anything obvious, either, so unless I really misunderstand your position at redhat, that would be your bad. ;-) If you mean start a partition on a pseudo-CHS boundary, fdisk seems to use what it thinks are cylinders for that. Please clarify what alignment provides a performance benefit. -- bill davidsen <davidsen@tmr.com> CTO TMR Associates, Inc Doing interesting things with small computers since 1979 ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Raid-10 mount at startup always has problem
  2007-10-29 0:21 ` Bill Davidsen
@ 2007-10-29 7:41 ` Luca Berra
  2007-10-29 13:22 ` Bill Davidsen
  2007-10-29 15:54 ` Gabor Gombas
  1 sibling, 2 replies; 42+ messages in thread
From: Luca Berra @ 2007-10-29 7:41 UTC (permalink / raw)
To: linux-raid

On Sun, Oct 28, 2007 at 08:21:34PM -0400, Bill Davidsen wrote:
>>Because you didn't stripe align the partition, your bad.
>>
>Align to /what/ stripe? Hardware (CHS is fiction), software (of the RAID
the real stripe (track) size of the storage, you must read the manual
and/or bug technical support for that info.

>you're about to create), or ??? I don't notice my FC6 or FC7 install
>programs using any special partition location to start, I have only run
>(tried to run) FC8-test3 for the live CD, so I can't say what it might
>do. CentOS4 didn't do anything obvious, either, so unless I really
>misunderstand your position at redhat, that would be your bad. ;-)
>
>If you mean start a partition on a pseudo-CHS boundary, fdisk seems to
>use what it thinks are cylinders for that.
Yes, fdisk will create partitions at sector 63 (due to CHS being braindead,
other than fictional: 63 sectors-per-track)
most arrays use 64 or 128 spt, and array caches are aligned accordingly.
So 63 is almost always the wrong choice.

for the default choice you must consider what spt your array uses, iirc
(this is from memory, so double check these figures)
IBM 64 spt (i think)
EMC DMX 64
EMC CX 128???
HDS (and HP XP) except OPEN-V 96
HDS (and HP XP) OPEN-V 128
HP EVA 4/6/8 with XCS 5.x state that no alignment is needed even though
i never found a technical explanation for that.
previous HP EVA versions did (maybe 64).
you might then want to consider how data is laid out on the storage, but
i believe the storage cache is enough to deal with that issue.

Please note that "0" is always well aligned.

Note to people who are now wondering WTH i am talking about.

consider a storage with 64 spt, an io size of 4k and a partition starting
at sector 63.
the first io request will require two ios from the storage (one for sector 63,
and one for sectors 64 to 70)
the next 7 ios (71-78,79-86,87-94,95-102,103-110,111-118,119-126) will be
on the same track
the 8th will again need to be split, and so on.
this causes the storage to do 1 unnecessary io every 8. YMMV.

L.

--
Luca Berra -- bluca@comedia.it
Communication Media & Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
 X  AGAINST HTML MAIL
/ \

^ permalink raw reply	[flat|nested] 42+ messages in thread
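A sketch of acting on this with period tools: fdisk's -u switch works in
sector units instead of fake cylinders, so the starting sector can be
chosen by hand. The device name and the 128-sector start are examples
only:

  fdisk -u /dev/sdb
  # at the first-sector prompt, enter 128 (or whatever matches your
  # array's sectors-per-track) instead of accepting the default 63

As Luca says, 0 is always well aligned, so any start that is a multiple
of the array's spt keeps 4k ios inside a single track.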
* Re: Raid-10 mount at startup always has problem
  2007-10-29 7:41 ` Luca Berra
@ 2007-10-29 13:22 ` Bill Davidsen
  2007-10-29 15:21 ` Doug Ledford
  2007-10-29 15:54 ` Gabor Gombas
  1 sibling, 1 reply; 42+ messages in thread
From: Bill Davidsen @ 2007-10-29 13:22 UTC (permalink / raw)
To: linux-raid

Luca Berra wrote:
> On Sun, Oct 28, 2007 at 08:21:34PM -0400, Bill Davidsen wrote:
>>> Because you didn't stripe align the partition, your bad.
>>>
>> Align to /what/ stripe? Hardware (CHS is fiction), software (of the RAID
> the real stripe (track) size of the storage, you must read the manual
> and/or bug technical support for that info.

That's my point, there *is* no "real stripe (track) size of the storage"
because modern drives use zone bit recording, and sectors per track
depends on track, and changes within a partition. See
http://www.dewassoc.com/kbase/hard_drives/hard_disk_sector_structures.htm
http://www.storagereview.com/guide2000/ref/hdd/op/mediaTracks.html

>> you're about to create), or ??? I don't notice my FC6 or FC7 install
>> programs using any special partition location to start, I have only
>> run (tried to run) FC8-test3 for the live CD, so I can't say what it
>> might do. CentOS4 didn't do anything obvious, either, so unless I
>> really misunderstand your position at redhat, that would be your
>> bad. ;-)
>>
>> If you mean start a partition on a pseudo-CHS boundary, fdisk seems
>> to use what it thinks are cylinders for that.
> Yes, fdisk will create partitions at sector 63 (due to CHS being
> braindead, other than fictional: 63 sectors-per-track)
> most arrays use 64 or 128 spt, and array caches are aligned accordingly.
> So 63 is almost always the wrong choice.

As the above links show, there's no right choice.

> for the default choice you must consider what spt your array uses, iirc
> (this is from memory, so double check these figures)
> IBM 64 spt (i think)
> EMC DMX 64
> EMC CX 128???
> HDS (and HP XP) except OPEN-V 96
> HDS (and HP XP) OPEN-V 128
> HP EVA 4/6/8 with XCS 5.x state that no alignment is needed even though
> i never found a technical explanation for that.
> previous HP EVA versions did (maybe 64).
> you might then want to consider how data is laid out on the storage, but
> i believe the storage cache is enough to deal with that issue.
>
> Please note that "0" is always well aligned.
>
> Note to people who are now wondering WTH i am talking about.
>
> consider a storage with 64 spt, an io size of 4k and a partition starting
> at sector 63.
> the first io request will require two ios from the storage (one for sector 63,
> and one for sectors 64 to 70)
> the next 7 ios (71-78,79-86,87-94,95-102,103-110,111-118,119-126) will be
> on the same track
> the 8th will again need to be split, and so on.
> this causes the storage to do 1 unnecessary io every 8. YMMV.

No one makes drives with fixed spt any more. Your assumptions are a
decade out of date.

--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979

^ permalink raw reply	[flat|nested] 42+ messages in thread
* Re: Raid-10 mount at startup always has problem
  2007-10-29 13:22 ` Bill Davidsen
@ 2007-10-29 15:21 ` Doug Ledford
  0 siblings, 0 replies; 42+ messages in thread
From: Doug Ledford @ 2007-10-29 15:21 UTC (permalink / raw)
To: Bill Davidsen; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 1121 bytes --]

On Mon, 2007-10-29 at 09:22 -0400, Bill Davidsen wrote:
> > consider a storage with 64 spt, an io size of 4k and a partition starting
> > at sector 63.
> > the first io request will require two ios from the storage (one for sector 63,
> > and one for sectors 64 to 70)
> > the next 7 ios (71-78,79-86,87-94,95-102,103-110,111-118,119-126) will be
> > on the same track
> > the 8th will again need to be split, and so on.
> > this causes the storage to do 1 unnecessary io every 8. YMMV.
> No one makes drives with fixed spt any more. Your assumptions are a
> decade out of date.

You're missing the point: it's not about drive tracks, it's about array
tracks, aka chunks. A 64k write, that should write to one and only one
chunk, ends up spanning two. That increases the amount of writing the
array has to do and the number of disks it busies for a typical single
I/O operation.

--
Doug Ledford <dledford@redhat.com> GPG KeyID: CFBFF194
http://people.redhat.com/dledford
Infiniband specific RPMs available at
http://people.redhat.com/dledford/Infiniband

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread
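The arithmetic behind the chunk-boundary point: the default 63-sector
start puts the partition 32256 bytes into the device, which is not a
multiple of a 64 KiB chunk. A quick check, assuming 512-byte sectors:

  echo $(( (63 * 512) % (64 * 1024) ))    # 32256 -> misaligned
  echo $(( (128 * 512) % (64 * 1024) ))   # 0 -> a 128-sector start is aligned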
* Re: Raid-10 mount at startup always has problem
  2007-10-29 7:41 ` Luca Berra
  2007-10-29 13:22 ` Bill Davidsen
@ 2007-10-29 15:54 ` Gabor Gombas
  1 sibling, 0 replies; 42+ messages in thread
From: Gabor Gombas @ 2007-10-29 15:54 UTC (permalink / raw)
To: linux-raid

On Mon, Oct 29, 2007 at 08:41:39AM +0100, Luca Berra wrote:

> consider a storage with 64 spt, an io size of 4k and a partition starting
> at sector 63.
> the first io request will require two ios from the storage (one for sector 63,
> and one for sectors 64 to 70)
> the next 7 ios (71-78,79-86,87-94,95-102,103-110,111-118,119-126) will be
> on the same track
> the 8th will again need to be split, and so on.
> this causes the storage to do 1 unnecessary io every 8. YMMV.

That's only true for random reads. If the OS does sufficient read-ahead
then sequential reads are affected much less. But the killers are the
misaligned random writes since then (considering RAID5/6 for simplicity)
the stripe has to be read from all component disks before it can be
written back.

Gabor

--
---------------------------------------------------------
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
---------------------------------------------------------

^ permalink raw reply	[flat|nested] 42+ messages in thread
* Re: Raid-10 mount at startup always has problem
  2007-10-29 0:21 ` Bill Davidsen
  2007-10-29 7:41 ` Luca Berra
@ 2007-10-29 14:31 ` Doug Ledford
  1 sibling, 0 replies; 42+ messages in thread
From: Doug Ledford @ 2007-10-29 14:31 UTC (permalink / raw)
To: Bill Davidsen; +Cc: Luca Berra, linux-raid

[-- Attachment #1: Type: text/plain, Size: 2633 bytes --]

On Sun, 2007-10-28 at 20:21 -0400, Bill Davidsen wrote:
> Doug Ledford wrote:
> > On Fri, 2007-10-26 at 11:15 +0200, Luca Berra wrote:
> >
> >> On Thu, Oct 25, 2007 at 02:40:06AM -0400, Doug Ledford wrote:
> >>
> >>> The partition table is the single, (mostly) universally recognized
> >>> arbiter of what possible data might be on the disk. Having a partition
> >>> table may not make mdadm recognize the md superblock any better, but it
> >>> keeps all that other stuff from even trying to access data that it
> >>> doesn't have a need to access and prevents random luck from turning your
> >>> day bad.
> >>>
> >> on a pc maybe, but that is 20 years old design.
> >>
> >
> > So? Unix is 35+ year old design, I suppose you want to switch to Vista
> > then?
> >
> >
> >> partition table design is limited because it is still based on C/H/S,
> >> which do not exist anymore.
> >> Put a partition table on a big storage, say a DMX, and enjoy a 20%
> >> performance decrease.
> >>
> >
> > Because you didn't stripe align the partition, your bad.
> >
> Align to /what/ stripe? Hardware (CHS is fiction), software (of the RAID
> you're about to create), or ??? I don't notice my FC6 or FC7 install
> programs using any special partition location to start, I have only run
> (tried to run) FC8-test3 for the live CD, so I can't say what it might
> do. CentOS4 didn't do anything obvious, either, so unless I really
> misunderstand your position at redhat, that would be your bad. ;-)
>
> If you mean start a partition on a pseudo-CHS boundary, fdisk seems to
> use what it thinks are cylinders for that.
>
> Please clarify what alignment provides a performance benefit.

Luca was specifically talking about the big multi-terabyte to petabyte
hardware arrays on the market. DMX, DDN, and others. When they export a
volume to the OS, there is an underlying stripe layout to that volume.
If you don't use any partition table at all, you are automatically
aligned with their stripes. However, if you do, then you have to align
your partition on a chunk boundary or else performance drops pretty
dramatically, because more writes than not end up crossing chunk
boundaries unnecessarily. It's only relevant when you are talking about
a raid device that shows the OS a single logical disk made from lots of
other disks.

--
Doug Ledford <dledford@redhat.com> GPG KeyID: CFBFF194
http://people.redhat.com/dledford
Infiniband specific RPMs available at
http://people.redhat.com/dledford/Infiniband

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread
* Re: Raid-10 mount at startup always has problem 2007-10-25 6:40 ` Doug Ledford 2007-10-26 9:15 ` Luca Berra @ 2007-10-29 5:59 ` Daniel L. Miller 2007-10-29 8:18 ` Luca Berra ` (2 more replies) 1 sibling, 3 replies; 42+ messages in thread From: Daniel L. Miller @ 2007-10-29 5:59 UTC (permalink / raw) To: linux-raid Doug Ledford wrote: > Anyway, I happen to *like* the idea of using full disk devices, but the > reality is that the md subsystem doesn't have exclusive ownership of the > disks at all times, and without that it really needs to stake a claim on > the space instead of leaving things to chance IMO. > I've been re-reading this post numerous times - trying to ignore the burgeoning flame war :) - and this last sentence finally clicked with me. As I'm a novice Linux user - and not involved in development at all - bear with me if I'm stating something obvious. And if I'm wrong - please be gentle! 1. md devices are not "native" to the kernel - they are created/assembled/activated/whatever by a userspace program. 2. Because md devices are "non-native" devices, and are composed of "native" devices, the kernel may try to use those components directly without going through md. 3. Creating a partition table somehow (I'm still not clear how/why) reduces the chance the kernel will access the drive directly without md. These concepts suddenly have me terrified over my data integrity. Is the md system so delicate that BOOT sequence can corrupt it? How is it more reliable AFTER the completed boot sequence? Nothing in the documentation (that I read - granted I don't always read everything) stated that partitioning prior to md creation was necessary - in fact references were provided on how to use complete disks. Is there an "official" position on, "To Partition, or Not To Partition"? Particularly for my application - dedicated Linux server, RAID-10 configuration, identical drives. And if partitioning is the answer - what do I need to do with my live dataset? Drop one drive, partition, then add the partition as a new drive to the set - and repeat for each drive after the rebuild finishes? -- Daniel ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Raid-10 mount at startup always has problem
  2007-10-29 5:59 ` Daniel L. Miller
@ 2007-10-29 8:18 ` Luca Berra
  2007-10-29 15:47 ` Doug Ledford
  2007-10-29 17:08 ` Doug Ledford
  2007-10-29 18:56 ` Richard Scobie
  2 siblings, 1 reply; 42+ messages in thread
From: Luca Berra @ 2007-10-29 8:18 UTC (permalink / raw)
To: linux-raid

On Sun, Oct 28, 2007 at 10:59:01PM -0700, Daniel L. Miller wrote:
>Doug Ledford wrote:
>>Anyway, I happen to *like* the idea of using full disk devices, but the
>>reality is that the md subsystem doesn't have exclusive ownership of the
>>disks at all times, and without that it really needs to stake a claim on
>>the space instead of leaving things to chance IMO.
>>
>I've been re-reading this post numerous times - trying to ignore the
>burgeoning flame war :) - and this last sentence finally clicked with me.
>
I am sorry Daniel, when i read Doug and Bill stating that your issue
was not having a partition table, i immediately took the bait and forgot
about your original issue.

I have no reason to believe your problem is due to not having a
partition table on your devices.

.... sda: unknown partition table
.... sdb: unknown partition table
.... sdc: unknown partition table
.... sdd: unknown partition table

the above clearly shows that the kernel does not see a partition table
where there is none, which is what happens in some cases and bit Doug
so hard.
Note, it does not happen at random, it should happen only if you use a
partitioned md device with a superblock at the end.
Or if you configure it wrongly, as Doug did.
(i am not accusing Doug of being stupid at all, it is a fairly common
mistake to make and we should try to prevent this in mdadm as much as
we can)

Again, having the kernel find a partition table where there is none
should not pose a problem at all unless there is some badly designed
software like udev/hal that believes it knows better than you about what
you have on your disks.
but _NEITHER OF THESE IS YOUR PROBLEM_ imho

I am also sorry to say that i fail to identify what the source of your
problem is; we should try harder instead of flaming at each other.

Is it possible to reproduce it on the live system,
e.g. unmount, stop the array, start it again and mount?
I bet it will work flawlessly in this case.
then i would disable starting this array at boot, and start it manually
when the system is up (stracing mdadm, so we can see what it does)

I am also wondering about this:
md: md0: raid array is not clean -- starting background reconstruction
does your system shut down properly?
do you see the message about stopping md at the very end of the
reboot/halt process?

L.

--
Luca Berra -- bluca@comedia.it
Communication Media & Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
 X  AGAINST HTML MAIL
/ \

^ permalink raw reply	[flat|nested] 42+ messages in thread
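For completeness, Luca's reproduction suggestion as a command sequence; a
sketch for a quiet moment on the live system, with illustrative paths:

  umount /dev/md0
  mdadm --stop /dev/md0
  strace -f -o /tmp/mdadm-assemble.trace mdadm -As -v
  mount /dev/md0
  cat /proc/mdstat    # a healthy 4-disk raid10 shows [UUUU]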
* Re: Raid-10 mount at startup always has problem
  2007-10-29 8:18 ` Luca Berra
@ 2007-10-29 15:47 ` Doug Ledford
  2007-10-29 21:29 ` Luca Berra
  0 siblings, 1 reply; 42+ messages in thread
From: Doug Ledford @ 2007-10-29 15:47 UTC (permalink / raw)
To: Luca Berra; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 4533 bytes --]

On Mon, 2007-10-29 at 09:18 +0100, Luca Berra wrote:
> On Sun, Oct 28, 2007 at 10:59:01PM -0700, Daniel L. Miller wrote:
> >Doug Ledford wrote:
> >>Anyway, I happen to *like* the idea of using full disk devices, but the
> >>reality is that the md subsystem doesn't have exclusive ownership of the
> >>disks at all times, and without that it really needs to stake a claim on
> >>the space instead of leaving things to chance IMO.
> >>
> >I've been re-reading this post numerous times - trying to ignore the
> >burgeoning flame war :) - and this last sentence finally clicked with me.
> >
> I am sorry Daniel, when i read Doug and Bill stating that your issue
> was not having a partition table, i immediately took the bait and forgot
> about your original issue.

I never said *his* issue was a lack of a partition table, I just said I
don't recommend that because it's flaky. The last statement I made
about his issue was to ask about whether the problem was happening
during initrd time or sysinit time to try and identify if it was failing
before or after / was mounted to try and determine where the issue might
lie. Then we got off on the tangent about partitions, and at the same
time Neil started asking about udev, at which point it came out that
he's running ubuntu, and as much as I would like to help, the fact of
the matter is that I've never touched ubuntu and wouldn't have the
faintest clue, so I let Neil handle it. At which point he found that
the udev scripts in ubuntu are being stupid, and from the looks of it
are the cause of the problem. So, I've considered the initial issue
root caused for a bit now.

> like udev/hal that believes it knows better than you about what you have
> on your disks.
> but _NEITHER OF THESE IS YOUR PROBLEM_ imho

Actually, it looks like udev *is* the problem, but not because of
partition tables.

> I am also sorry to say that i fail to identify what the source of your
> problem is; we should try harder instead of flaming at each other.

We can do both, or at least I can :-P

> Is it possible to reproduce it on the live system,
> e.g. unmount, stop the array, start it again and mount?
> I bet it will work flawlessly in this case.
> then i would disable starting this array at boot, and start it manually
> when the system is up (stracing mdadm, so we can see what it does)
>
> I am also wondering about this:
> md: md0: raid array is not clean -- starting background reconstruction
> does your system shut down properly?
> do you see the message about stopping md at the very end of the
> reboot/halt process?

The root cause is that as udev adds his sata devices one at a time, on
each add of the sata device it invokes mdadm to see if there is an array
to start, and it doesn't use incremental mode on mdadm. As a result, as
soon as there are 3 out of the 4 disks present, mdadm starts the array
in degraded mode. It's probably a race between the mdadm started on the
third disk and the mdadm started on the fourth disk that results in the
message about being unable to set the array info.
The one losing the race gets the error, as the other one has already
manipulated the array (for example, the 4th disk mdadm could be trying
to add the first disk to the array, but it's already there, so it gets
this error and bails).

So, as much as you might dislike mkinitrd since 5.0 Luca, it doesn't
have this particular problem ;-) In the initrd we produce, it loads all
the SCSI/SATA/etc drivers first, then calls mkblkdevs which forces all
of the devices to appear in /dev, and only then does it start the
mdadm/lvm configuration.

Daniel, I make no promises what so ever that this will even work at all
as it may fail to load modules or all other sorts of weirdness, but if
you want to test the theory, you can download the latest mkinitrd from
fedoraproject.org, then use it to create an initrd image under some
other name than your default image name, then manually edit your boot
to have an extra stanza that uses the mkinitrd generated initrd image
instead of the ubuntu image, and then just see if it brings the md
device up cleanly instead of in degraded mode. That should be a fairly
quick and easy way to test if Neil's analysis of the udev script was
right.

--
Doug Ledford <dledford@redhat.com> GPG KeyID: CFBFF194
http://people.redhat.com/dledford
Infiniband specific RPMs available at
http://people.redhat.com/dledford/Infiniband

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread
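For comparison, the incremental-mode udev rule alluded to above would
look roughly like the sketch below. Whether $env{DEVNAME} is available
depends on the udev version, so treat this as an illustration rather
than a drop-in fix:

  SUBSYSTEM=="block", ACTION=="add", ENV{ID_FS_TYPE}=="linux_raid*", \
      RUN+="/sbin/mdadm --incremental $env{DEVNAME}"

With --incremental, mdadm adds one member at a time and only starts the
array once all expected devices have appeared, instead of racing a full
assemble on every hotplug event.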
* Re: Raid-10 mount at startup always has problem 2007-10-29 15:47 ` Doug Ledford @ 2007-10-29 21:29 ` Luca Berra 2007-10-29 23:15 ` Doug Ledford 0 siblings, 1 reply; 42+ messages in thread From: Luca Berra @ 2007-10-29 21:29 UTC (permalink / raw) To: linux-raid On Mon, Oct 29, 2007 at 11:47:19AM -0400, Doug Ledford wrote: >On Mon, 2007-10-29 at 09:18 +0100, Luca Berra wrote: >> On Sun, Oct 28, 2007 at 10:59:01PM -0700, Daniel L. Miller wrote: >> >Doug Ledford wrote: >> >>Anyway, I happen to *like* the idea of using full disk devices, but the >> >>reality is that the md subsystem doesn't have exclusive ownership of the >> >>disks at all times, and without that it really needs to stake a claim on >> >>the space instead of leaving things to chance IMO. >> >> >> >I've been re-reading this post numerous times - trying to ignore the >> >burgeoning flame war :) - and this last sentence finally clicked with me. >> > >> I am sorry Daniel, when i read Doug and Bill, stating that your issue >> was not having a partition table, i immediately took the bait and forgot >> about your original issue. > >I never said *his* issue was lack of partition table, I just said I >don't recommend that because it's flaky. The last statement I made maybe i misread you but Bill was quite clear. >about his issue was to ask about whether the problem was happening >during initrd time or sysinit time to try and identify if it was failing >before or after / was mounted to try and determine where the issue might >lay. Then we got off on the tangent about partitions, and at the same >time Neil started asking about udev, at which point it came out that >he's running ubuntu, and as much as I would like to help, the fact of >the matter is that I've never touched ubuntu and wouldn't have the >faintest clue, so I let Neil handle it. At which point he found that >the udev scripts in ubuntu are being stupid, and from the looks of it >are the cause of the problem. So, I've considered the initial issue >root caused for a bit now. It seems i made an idiot of myself by missing half of the thread, and i even knew ubuntu was braindead in their use of udev at startup, since a similar discussion came up on the lvm or the dm-devel mailing list (that time iirc it was about lvm over multipath) >> like udev/hal that believes it knows better than you about what you have >> on your disks. >> but _NEITHER OF THESE IS YOUR PROBLEM_ imho > >Actually, it looks like udev *is* the problem, but not because of >partition tables. you are right. L. -- Luca Berra -- bluca@comedia.it Communication Media & Services S.r.l. /"\ \ / ASCII RIBBON CAMPAIGN X AGAINST HTML MAIL / \ ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Raid-10 mount at startup always has problem 2007-10-29 21:29 ` Luca Berra @ 2007-10-29 23:15 ` Doug Ledford 2007-10-30 0:03 ` Daniel L. Miller 0 siblings, 1 reply; 42+ messages in thread From: Doug Ledford @ 2007-10-29 23:15 UTC (permalink / raw) To: Luca Berra; +Cc: linux-raid [-- Attachment #1: Type: text/plain, Size: 1185 bytes --] On Mon, 2007-10-29 at 22:29 +0100, Luca Berra wrote: > At which point he found that > >the udev scripts in ubuntu are being stupid, and from the looks of it > >are the cause of the problem. So, I've considered the initial issue > >root caused for a bit now. > It seems i made an idiot of myself by missing half of the thread, and i > even knew ubuntu was braindead in their use of udev at startup, since a > similar discussion came up on the lvm or the dm-devel mailing list (that > time iirc it was about lvm over multipath) Nah. Even if we had concluded that udev was to blame here, I'm not entirely certain that we hadn't left Daniel with the impression that we suspected it versus blamed it, so reiterating it doesn't hurt. And I'm sure no one has given him a fix for the problem (although Neil did request a change that will give debug output, but not solve the problem), so not dropping it entirely would seem appropriate as well. -- Doug Ledford <dledford@redhat.com> GPG KeyID: CFBFF194 http://people.redhat.com/dledford Infiniband specific RPMs available at http://people.redhat.com/dledford/Infiniband [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Raid-10 mount at startup always has problem 2007-10-29 23:15 ` Doug Ledford @ 2007-10-30 0:03 ` Daniel L. Miller 2007-11-01 13:56 ` Bill Davidsen 2007-12-17 14:58 ` Daniel L. Miller 0 siblings, 2 replies; 42+ messages in thread From: Daniel L. Miller @ 2007-10-30 0:03 UTC (permalink / raw) To: linux-raid Doug Ledford wrote: > Nah. Even if we had concluded that udev was to blame here, I'm not > entirely certain that we hadn't left Daniel with the impression that we > suspected it versus blamed it, so reiterating it doesn't hurt. And I'm > sure no one has given him a fix for the problem (although Neil did > request a change that will give debug output, but not solve the > problem), so not dropping it entirely would seem appropriate as well. > I've opened a bug report on Ubuntu's Launchpad.net. Scott James Remnant asked me to cc him on Neil's incremental reference - we'll see what happens from here. Thanks for the help guys. At the moment, I've changed my mdadm.conf to explicitly list the drives, instead of the auto=partition parameter. We'll see what happens on the next reboot. I don't know if it means anything, but I'm using a self-compiled 2.6.22 kernel - with initrd. At least I THINK I'm using initrd - I have an image, but I don't see an initrd line in my grub config. Hmm....I'm going to add a stanza that includes the initrd and see what happens also. -- Daniel ^ permalink raw reply [flat|nested] 42+ messages in thread
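For reference, the explicit-device variant Daniel describes would
presumably look something like this, with the UUID taken from earlier in
the thread and the device names assumed:

  DEVICE /dev/sda /dev/sdb /dev/sdc /dev/sdd
  ARRAY /dev/md0 level=raid10 num-devices=4 UUID=9d94b17b:f5fac31a:577c252b:0d4c4b2a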
* Re: Raid-10 mount at startup always has problem 2007-10-30 0:03 ` Daniel L. Miller @ 2007-11-01 13:56 ` Bill Davidsen 2007-12-17 14:58 ` Daniel L. Miller 1 sibling, 0 replies; 42+ messages in thread From: Bill Davidsen @ 2007-11-01 13:56 UTC (permalink / raw) To: Daniel L. Miller; +Cc: linux-raid Daniel L. Miller wrote: > Doug Ledford wrote: >> Nah. Even if we had concluded that udev was to blame here, I'm not >> entirely certain that we hadn't left Daniel with the impression that we >> suspected it versus blamed it, so reiterating it doesn't hurt. And I'm >> sure no one has given him a fix for the problem (although Neil did >> request a change that will give debug output, but not solve the >> problem), so not dropping it entirely would seem appropriate as well. >> > I've opened a bug report on Ubuntu's Launchpad.net. Scott James > Remnant asked me to cc him on Neil's incremental reference - we'll see > what happens from here. > > Thanks for the help guys. At the moment, I've changed my mdadm.conf > to explicitly list the drives, instead of the auto=partition > parameter. We'll see what happens on the next reboot. > > I don't know if it means anything, but I'm using a self-compiled > 2.6.22 kernel - with initrd. At least I THINK I'm using initrd - I > have an image, but I don't see an initrd line in my grub config. > Hmm....I'm going to add a stanza that includes the initrd and see what > happens also. > What did that do? -- bill davidsen <davidsen@tmr.com> CTO TMR Associates, Inc Doing interesting things with small computers since 1979 ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Raid-10 mount at startup always has problem
  2007-10-30 0:03 ` Daniel L. Miller
  2007-11-01 13:56 ` Bill Davidsen
@ 2007-12-17 14:58 ` Daniel L. Miller
  1 sibling, 0 replies; 42+ messages in thread
From: Daniel L. Miller @ 2007-12-17 14:58 UTC (permalink / raw)
To: linux-raid

Daniel L. Miller wrote:
> Doug Ledford wrote:
>> Nah. Even if we had concluded that udev was to blame here, I'm not
>> entirely certain that we hadn't left Daniel with the impression that we
>> suspected it versus blamed it, so reiterating it doesn't hurt. And I'm
>> sure no one has given him a fix for the problem (although Neil did
>> request a change that will give debug output, but not solve the
>> problem), so not dropping it entirely would seem appropriate as well.
>>
> I've opened a bug report on Ubuntu's Launchpad.net. Scott James
> Remnant asked me to cc him on Neil's incremental reference - we'll see
> what happens from here.
>
> Thanks for the help guys. At the moment, I've changed my mdadm.conf
> to explicitly list the drives, instead of the auto=partition
> parameter. We'll see what happens on the next reboot.
>
> I don't know if it means anything, but I'm using a self-compiled
> 2.6.22 kernel - with initrd. At least I THINK I'm using initrd - I
> have an image, but I don't see an initrd line in my grub config.
> Hmm....I'm going to add a stanza that includes the initrd and see what
> happens also.
>
Wow. Been a while since I asked about this - I just realized a reboot
or two has come and gone. I checked my md status - everything was
online! Cool. My current dmesg output:

sata_nv 0000:00:07.0: version 3.4
ACPI: PCI Interrupt Link [LTID] enabled at IRQ 23
ACPI: PCI Interrupt 0000:00:07.0[A] -> Link [LTID] -> GSI 23 (level, high) -> IRQ 23
sata_nv 0000:00:07.0: Using ADMA mode
PCI: Setting latency timer of device 0000:00:07.0 to 64
scsi0 : sata_nv
scsi1 : sata_nv
ata1: SATA max UDMA/133 cmd 0xffffc20001428480 ctl 0xffffc200014284a0 bmdma 0x0000000000011410 irq 23
ata2: SATA max UDMA/133 cmd 0xffffc20001428580 ctl 0xffffc200014285a0 bmdma 0x0000000000011418 irq 23
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ATA-7: ST3160811AS, 3.AAE, max UDMA/133
ata1.00: 312581808 sectors, multi 16: LBA48 NCQ (depth 31/32)
ata1.00: configured for UDMA/133
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ATA-7: ST3160811AS, 3.AAE, max UDMA/133
ata2.00: 312581808 sectors, multi 16: LBA48 NCQ (depth 31/32)
ata2.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access ATA ST3160811AS 3.AA PQ: 0 ANSI: 5
ata1: bounce limit 0xFFFFFFFFFFFFFFFF, segment boundary 0xFFFFFFFF, hw segs 61
scsi 1:0:0:0: Direct-Access ATA ST3160811AS 3.AA PQ: 0 ANSI: 5
ata2: bounce limit 0xFFFFFFFFFFFFFFFF, segment boundary 0xFFFFFFFF, hw segs 61
ACPI: PCI Interrupt Link [LSI1] enabled at IRQ 22
ACPI: PCI Interrupt 0000:00:08.0[A] -> Link [LSI1] -> GSI 22 (level, high) -> IRQ 22
sata_nv 0000:00:08.0: Using ADMA mode
PCI: Setting latency timer of device 0000:00:08.0 to 64
scsi2 : sata_nv
scsi3 : sata_nv
ata3: SATA max UDMA/133 cmd 0xffffc2000142a480 ctl 0xffffc2000142a4a0 bmdma 0x0000000000011420 irq 22
ata4: SATA max UDMA/133 cmd 0xffffc2000142a580 ctl 0xffffc2000142a5a0 bmdma 0x0000000000011428 irq 22
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata3.00: ATA-7: ST3160811AS, 3.AAE, max UDMA/133
ata3.00: 312581808 sectors, multi 16: LBA48 NCQ (depth 31/32)
ata3.00: configured for UDMA/133
ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata4.00: ATA-7: ST3160811AS, 3.AAE, max UDMA/133
ata4.00:
312581808 sectors, multi 16: LBA48 NCQ (depth 31/32) ata4.00: configured for UDMA/133 scsi 2:0:0:0: Direct-Access ATA ST3160811AS 3.AA PQ: 0 ANSI: 5 ata3: bounce limit 0xFFFFFFFFFFFFFFFF, segment boundary 0xFFFFFFFF, hw segs 61 scsi 3:0:0:0: Direct-Access ATA ST3160811AS 3.AA PQ: 0 ANSI: 5 ata4: bounce limit 0xFFFFFFFFFFFFFFFF, segment boundary 0xFFFFFFFF, hw segs 61 sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB) sd 0:0:0:0: [sda] Write Protect is off sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB) sd 0:0:0:0: [sda] Write Protect is off sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00 sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sda: unknown partition table sd 0:0:0:0: [sda] Attached SCSI disk sd 1:0:0:0: [sdb] 312581808 512-byte hardware sectors (160042 MB) sd 1:0:0:0: [sdb] Write Protect is off sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00 sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sd 1:0:0:0: [sdb] 312581808 512-byte hardware sectors (160042 MB) sd 1:0:0:0: [sdb] Write Protect is off sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00 sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sdb: unknown partition table sd 1:0:0:0: [sdb] Attached SCSI disk sd 2:0:0:0: [sdc] 312581808 512-byte hardware sectors (160042 MB) sd 2:0:0:0: [sdc] Write Protect is off sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00 sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sd 2:0:0:0: [sdc] 312581808 512-byte hardware sectors (160042 MB) sd 2:0:0:0: [sdc] Write Protect is off sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00 sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sdc: unknown partition table sd 2:0:0:0: [sdc] Attached SCSI disk sd 3:0:0:0: [sdd] 312581808 512-byte hardware sectors (160042 MB) sd 3:0:0:0: [sdd] Write Protect is off sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00 sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sd 3:0:0:0: [sdd] 312581808 512-byte hardware sectors (160042 MB) sd 3:0:0:0: [sdd] Write Protect is off sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00 sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA sdd: unknown partition table sd 3:0:0:0: [sdd] Attached SCSI disk Adding 8000328k swap on /dev/hda5. Priority:-1 extents:1 across:8000328k EXT3 FS on hda1, internal journal device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: dm-devel@redhat.com md: md0 stopped. md: bind<sdb> md: bind<sdc> md: bind<sdd> md: bind<sda> md: md0: raid array is not clean -- starting background reconstruction raid10: raid set md0 active with 4 out of 4 devices md: resync of RAID array md0 md: minimum _guaranteed_ speed: 1000 KB/sec/disk. md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync. md: using 128k window, over a total of 312581632 blocks. tg3: eth0: Link is up at 1000 Mbps, full duplex. tg3: eth0: Flow control is on for TX and on for RX. 
Filesystem "md0": Disabling barriers, not supported by the underlying device
XFS mounting filesystem md0
Starting XFS recovery on filesystem: md0 (logdev: internal)
Ending XFS recovery on filesystem: md0 (logdev: internal)
XFS mounting filesystem hda2
Starting XFS recovery on filesystem: hda2 (logdev: internal)
Ending XFS recovery on filesystem: hda2 (logdev: internal)
XFS mounting filesystem hda3
Starting XFS recovery on filesystem: hda3 (logdev: internal)
Ending XFS recovery on filesystem: hda3 (logdev: internal)
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
tun: Universal TUN/TAP device driver, 1.6
tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
PM: Writing back config space on device 0000:0a:09.1 at offset b (was 164814e4, writing 164414e4)
PM: Writing back config space on device 0000:0a:09.1 at offset 3 (was 804000, writing 804010)
PM: Writing back config space on device 0000:0a:09.1 at offset 2 (was 2000000, writing 2000003)
PM: Writing back config space on device 0000:0a:09.1 at offset 1 (was 2b00000, writing 2b00106)
ADDRCONF(NETDEV_UP): eth1: link is not ready
Bridge firewalling registered
device eth1 entered promiscuous mode
audit(1197159016.060:2): dev=eth1 prom=256 old_prom=0 auid=4294967295
device tap1 entered promiscuous mode
audit(1197159016.060:3): dev=tap1 prom=256 old_prom=0 auid=4294967295
br1: starting userspace STP failed, starting kernel STP
br1: port 2(tap1) entering listening state
tg3: eth1: Link is up at 1000 Mbps, full duplex.
tg3: eth1: Flow control is on for TX and on for RX.
ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
br1: port 1(eth1) entering listening state
br1: port 2(tap1) entering learning state
br1: port 1(eth1) entering learning state
eth0: no IPv6 routers present
br1: topology change detected, propagating
br1: port 2(tap1) entering forwarding state
tap1: no IPv6 routers present
br1: no IPv6 routers present
br1: topology change detected, propagating
br1: port 1(eth1) entering forwarding state
eth1: no IPv6 routers present
ip_tables: (C) 2000-2006 Netfilter Core Team
Netfilter messages via NETLINK v0.30.
nf_conntrack version 0.5.0 (8192 buckets, 65536 max)
tun0: Disabled Privacy Extensions
parport_pc 00:0c: reported by Plug and Play ACPI
parport0: PC-style at 0x378 (0x778), irq 7, dma 3 [PCSPP,TRISTATE,COMPAT,EPP,ECP,DMA]
lp0: using parport0 (interrupt-driven).
NET: Registered protocol family 17
vmmon: module license 'unspecified' taints kernel.
/dev/vmmon[4622]: VMCI: Driver initialized.
/dev/vmmon[4622]: Module vmmon: registered with major=10 minor=165
/dev/vmmon[4622]: Module vmmon: initialized
/dev/vmnet: open called by PID 4649 (vmnet-bridge)
/dev/vmnet: hub 0 does not exist, allocating memory.
/dev/vmnet: port on hub 0 successfully opened
bridge-br1: enabling the bridge
bridge-br1: up
bridge-br1: already up
bridge-br1: attached
/dev/vmnet: open called by PID 4663 (vmnet-natd)
/dev/vmnet: hub 8 does not exist, allocating memory.
/dev/vmnet: port on hub 8 successfully opened /dev/vmnet: open called by PID 4668 (vmnet-netifup) /dev/vmnet: port on hub 8 successfully opened /dev/vmnet: open called by PID 4679 (vmnet-dhcpd) /dev/vmnet: port on hub 8 successfully opened vmnet8: no IPv6 routers present /dev/vmnet: open called by PID 4798 (vmware-vmx) device br1 entered promiscuous mode audit(1197159105.109:4): dev=br1 prom=256 old_prom=0 auid=4294967295 bridge-br1: enabled promiscuous mode /dev/vmnet: port on hub 0 successfully opened /dev/vmmon[4864]: host clock rate change request 0 -> 19 /dev/vmmon[4864]: host clock rate change request 19 -> 83 device br1 left promiscuous mode audit(1197159183.647:5): dev=br1 prom=0 old_prom=256 auid=4294967295 bridge-br1: disabled promiscuous mode /dev/vmnet: open called by PID 4864 (vmware-vmx) device br1 entered promiscuous mode audit(1197159183.647:6): dev=br1 prom=256 old_prom=0 auid=4294967295 bridge-br1: enabled promiscuous mode /dev/vmnet: port on hub 0 successfully opened /dev/vmnet: open called by PID 4945 (vmware-vmx) /dev/vmnet: port on hub 0 successfully opened /dev/vmnet: open called by PID 4983 (vmware-vmx) /dev/vmnet: port on hub 0 successfully opened md: md0: resync done. RAID10 conf printout: --- wd:4 rd:4 disk 0, wo:0, o:1, dev:sda disk 1, wo:0, o:1, dev:sdb disk 2, wo:0, o:1, dev:sdc disk 3, wo:0, o:1, dev:sdd /dev/vmnet: open called by PID 4983 (vmware-vmx) /dev/vmnet: port on hub 0 successfully opened vmmon: Had to deallocate locked 118026 pages from vm driver ffff810123e5a000 vmmon: Had to deallocate AWE 3437 pages from vm driver ffff810123e5a000 -- Daniel ^ permalink raw reply [flat|nested] 42+ messages in thread
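One line in that log is worth keeping an eye on: "raid array is not
clean -- starting background reconstruction" means the array was not
marked clean at the previous shutdown, which is exactly the question
Luca raised. Two hedged checks:

  # after the resync finishes, the array should report clean:
  mdadm --detail /dev/md0 | grep -i state
  # and after the next orderly reboot, this should come back empty:
  dmesg | grep -i 'not clean'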
* Re: Raid-10 mount at startup always has problem 2007-10-29 5:59 ` Daniel L. Miller 2007-10-29 8:18 ` Luca Berra @ 2007-10-29 17:08 ` Doug Ledford 2007-10-29 18:56 ` Richard Scobie 2 siblings, 0 replies; 42+ messages in thread From: Doug Ledford @ 2007-10-29 17:08 UTC (permalink / raw) To: Daniel L. Miller; +Cc: linux-raid [-- Attachment #1: Type: text/plain, Size: 6078 bytes --] On Sun, 2007-10-28 at 22:59 -0700, Daniel L. Miller wrote: > Doug Ledford wrote: > > Anyway, I happen to *like* the idea of using full disk devices, but the > > reality is that the md subsystem doesn't have exclusive ownership of the > > disks at all times, and without that it really needs to stake a claim on > > the space instead of leaving things to chance IMO. > > > I've been re-reading this post numerous times - trying to ignore the > burgeoning flame war :) - and this last sentence finally clicked with me. > > As I'm a novice Linux user - and not involved in development at all - > bear with me if I'm stating something obvious. And if I'm wrong - > please be gentle! > > 1. md devices are not "native" to the kernel - they are > created/assembled/activated/whatever by a userspace program. My real point was that md doesn't own the disks, meaning that during startup, and at other points in time, software other than the md stack can attempt to use the disk directly. That software may be the linux file system code, linux lvm code, or in some case entirely different OS software. Given that these situations can arise, using a partition table to mark the space as in use by linux is what I meant by staking a claim. It doesn't keep the linux kernel from using it because it thinks it owns it, but it does stop other software from attempting to use it. > 2. Because md devices are "non-native" devices, and are composed of > "native" devices, the kernel may try to use those components directly > without going through md. In the case of superblocks at the end, yes. The kernel may see the underlying file system or lvm disk label even if the md device is not started. > 3. Creating a partition table somehow (I'm still not clear how/why) > reduces the chance the kernel will access the drive directly without md. The partition table is more to tell other software that linux owns the space and to avoid mistakes where someone runs fdisk on a disk accidentally and wipes out your array because they added a partition table on what they thought was a new disk (more likely when you have large arrays of disks attached via fiber channel or such than in a single system). Putting the superblock at the beginning of the md device is the main thing that guarantees the kernel will never try to use what's inside the md device without the md device running. > These concepts suddenly have me terrified over my data integrity. Is > the md system so delicate that BOOT sequence can corrupt it? If you have your superblocks at the end of the devices, then there are certain failure modes that can cause data inconsistencies. Generally speaking they won't harm the array itself, it's just that the different disks in a raid1 array might contain different data. If you don't use partitions, then the majority of failure scenarios involve things like accidental use of fdisk on the unpartitioned device, access of the device by other OSes, that sort of thing. > How is it > more reliable AFTER the completed boot sequence? 
Once the array is up and running, the constituent disks are marked as
busy in the operating system, which prevents other portions of the linux
kernel and other software in general from getting at the md-owned disks.

> Nothing in the documentation (that I read - granted I don't always read
> everything) stated that partitioning prior to md creation was necessary
> - in fact references were provided on how to use complete disks. Is
> there an "official" position on, "To Partition, or Not To Partition"?
> Particularly for my application - dedicated Linux server, RAID-10
> configuration, identical drives.
>
> And if partitioning is the answer - what do I need to do with my live
> dataset? Drop one drive, partition, then add the partition as a new
> drive to the set - and repeat for each drive after the rebuild finishes?

You *probably*, and I emphasize probably, don't need to do anything. I
emphasize it because I don't know enough about your situation to say so
with 100% certainty. If I'm wrong, it's not my fault.

Now, that said, here's the gist of the situation. There are specific
failure cases that can corrupt data in an md raid1 array, mainly related
to superblocks at the end of devices. There are specific failure cases
where an unpartitioned device can be accidentally partitioned, or where
a partitioned md array, in combination with superblocks at the end and a
whole disk device, can be misrecognized as a partitioned normal drive.
There are, on the other hand, cases where it's perfectly safe to use
unpartitioned devices, or superblocks at the end of devices.

My recommendation when someone asks what to do is to use partitions, and
to use superblocks at the beginning of the devices (except for /boot
since that isn't supported at the moment). The reason I give that advice
is that I assume if a person knows enough to know when it's safe to use
unpartitioned devices, like Luca, then they wouldn't be asking me for
advice. So since they *are* asking my advice, and since a lot of the
failure cases have as much to do with human error as they do with
software error, and since human error always seems to find new ways to
err, it's impossible to list all the error cases, and so it's best just
to give the known safe advice.

Just because you heard the advice after creating your arrays is no
reason to panic though. Since the disks are local to your linux server
and not attached via a fiber channel network or something similar, about
2/3rds of the failure cases drop away immediately. And given that you
are using raid10 instead of raid1, the possible silent inconsistency
issue drops away. All in all, you're pretty safe.

--
Doug Ledford <dledford@redhat.com> GPG KeyID: CFBFF194
http://people.redhat.com/dledford
Infiniband specific RPMs available at
http://people.redhat.com/dledford/Infiniband

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread
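If someone in Daniel's position nonetheless decided to migrate a live
raid10 to partitioned members, the disk-at-a-time procedure he outlined
earlier would go roughly as below. A sketch only: each rebuild must
finish before the next disk is touched, a disk failure mid-migration can
cost data, and md will refuse the partition if it comes out smaller than
the array's used device size:

  mdadm /dev/md0 --fail /dev/sdd --remove /dev/sdd
  fdisk -u /dev/sdd        # one partition, type fd, aligned as discussed
  mdadm /dev/md0 --add /dev/sdd1
  watch cat /proc/mdstat   # wait for the rebuild, then repeat for sdc, sdb, sda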
* Re: Raid-10 mount at startup always has problem
  2007-10-29 5:59 ` Daniel L. Miller
  2007-10-29 8:18 ` Luca Berra
  2007-10-29 17:08 ` Doug Ledford
@ 2007-10-29 18:56 ` Richard Scobie
  2 siblings, 0 replies; 42+ messages in thread
From: Richard Scobie @ 2007-10-29 18:56 UTC (permalink / raw)
To: linux-raid

Daniel L. Miller wrote:

> Nothing in the documentation (that I read - granted I don't always read
> everything) stated that partitioning prior to md creation was necessary
> - in fact references were provided on how to use complete disks. Is
> there an "official" position on, "To Partition, or Not To Partition"?
> Particularly for my application - dedicated Linux server, RAID-10
> configuration, identical drives.

My simplistic reason for always making one partition on md drives,
about 100MB smaller than the full space, has been insurance: it allows
use of a replacement drive from another manufacturer which, while
nominally marked as the same size as the originals, is in fact slightly
smaller.

Regards,

Richard

^ permalink raw reply	[flat|nested] 42+ messages in thread
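Richard's convention can be made concrete with sfdisk in sector units.
Using the 312581808-sector drives from this thread and leaving 204800
sectors (100 MiB) of slack at the end; the numbers are illustrative and
the command overwrites any existing label:

  echo '63,312376945,fd' | sfdisk -uS /dev/sde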
* Re: Raid-10 mount at startup always has problem
  2007-10-24 14:22 ` Daniel L. Miller
  2007-10-24 16:25 ` Doug Ledford
  2007-10-24 20:01 ` Bill Davidsen
@ 2007-10-25 6:12 ` Neil Brown
  2007-10-25 6:51 ` Doug Ledford
  ` (3 more replies)
  2 siblings, 4 replies; 42+ messages in thread
From: Neil Brown @ 2007-10-25 6:12 UTC (permalink / raw)
To: Daniel L. Miller; +Cc: linux-raid

On Wednesday October 24, dmiller@amfes.com wrote:
> Current mdadm.conf:
> DEVICE partitions
> ARRAY /dev/.static/dev/md0 level=raid10 num-devices=4
> UUID=9d94b17b:f5fac31a:577c252b:0d4c4b2a auto=part
>
> still have the problem where on boot one drive is not part of the
> array. Is there a log file I can check to find out WHY a drive is not
> being added? It's been a while since the reboot, but I did find some
> entries in dmesg - I'm appending both the md lines and the physical disk
> related lines. The bottom shows one disk not being added (this time it
> was sda) - and the disk that gets skipped on each boot seems to be
> random - there's no consistent failure:

Odd.... but interesting.
Does it sometimes fail to start the array altogether?

> md: md0 stopped.
> md: md0 stopped.
> md: bind<sdc>
> md: bind<sdd>
> md: bind<sdb>
> md: md0: raid array is not clean -- starting background reconstruction
> raid10: raid set md0 active with 3 out of 4 devices
> md: couldn't update array info. -22
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is the most surprising line, and hence the one most likely to
convey helpful information.

This message is generated when a process calls "SET_ARRAY_INFO" on an
array that is already running, and the changes implied by the new
"array_info" are not supportable.

The only way I can see this happening is if two copies of "mdadm" are
running at exactly the same time and both are trying to assemble the
same array. The first calls SET_ARRAY_INFO and assembles the (partial)
array. The second calls SET_ARRAY_INFO and gets this error. Not all
devices are included because when one mdadm went to look at a device,
the other had it locked, and so the first just ignored it.

I just tried that, and sometimes it worked, but sometimes it assembled
with 3 out of 4 devices. I didn't get the "couldn't update array info"
message, but that doesn't prove I'm wrong.

I cannot imagine how that might be happening (two at once) unless maybe
'udev' had been configured to do something as soon as devices were
discovered.... seems unlikely.

It might be worth finding out where mdadm is being run in the init
scripts and add a "-v" flag, and redirecting stdout/stderr to some log
file. e.g.
mdadm -As -v > /var/log/mdadm-$$ 2>&1

And see if that leaves something useful in the log file.

BTW, I don't think your problem has anything to do with the fact that
you are using whole disks. While it is debatable whether that is a good
idea or not (I like the idea, but Doug doesn't and I respect his
opinion) I doubt it would contribute to the current problem.

Your description makes me nearly certain that there is some sort of
race going on (that is the easiest way to explain randomly differing
behaviours). The race is probably between different code 'locking'
(opening with O_EXCL) the various devices. Given the above error
message, two different 'mdadm's seem most likely, but an mdadm and a
mount-by-label scan could probably do it too.

NeilBrown

^ permalink raw reply	[flat|nested] 42+ messages in thread
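Neil's "I just tried that" can be reproduced deliberately on a scratch
machine; a rough sketch of racing two assemblies, not something to run
on a box whose data matters:

  mdadm --stop /dev/md0
  mdadm -As -v > /tmp/mdadm-a.log 2>&1 &
  mdadm -As -v > /tmp/mdadm-b.log 2>&1
  wait
  cat /proc/mdstat   # sometimes 4 of 4 devices, sometimes 3, per Neil's test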
* Re: Raid-10 mount at startup always has problem
  2007-10-25  6:12       ` Neil Brown
@ 2007-10-25  6:51         ` Doug Ledford
  2007-10-25 13:33         ` Daniel L. Miller
  ` (1 subsequent sibling)
  2 siblings, 0 replies; 42+ messages in thread
From: Doug Ledford @ 2007-10-25 6:51 UTC (permalink / raw)
To: Neil Brown; +Cc: Daniel L. Miller, linux-raid

[-- Attachment #1: Type: text/plain, Size: 1623 bytes --]

On Thu, 2007-10-25 at 16:12 +1000, Neil Brown wrote:
> > md: md0 stopped.
> > md: md0 stopped.
> > md: bind<sdc>
> > md: bind<sdd>
> > md: bind<sdb>
> > md: md0: raid array is not clean -- starting background reconstruction
> > raid10: raid set md0 active with 3 out of 4 devices
> > md: couldn't update array info. -22
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> This is the most surprising line, and hence the one most likely to
> convey helpful information.
>
> This message is generated when a process calls "SET_ARRAY_INFO" on an
> array that is already running, and the changes implied by the new
> "array_info" are not supportable.
>
> The only way I can see this happening is if two copies of "mdadm" are
> running at exactly the same time and both are trying to assemble the
> same array.  The first calls SET_ARRAY_INFO and assembles the
> (partial) array.  The second calls SET_ARRAY_INFO and gets this error.
> Not all devices are included because while one mdadm went to look at a
> device, the other had it locked, so the first just ignored it.

If mdadm copy A gets three of the devices, I wouldn't think mdadm copy B
would have been able to get enough devices to decide to even try to
assemble the array (assuming that once copy A locked the devices during
open, it then held them until it was time to assemble the array).

--
Doug Ledford <dledford@redhat.com>
GPG KeyID: CFBFF194
http://people.redhat.com/dledford

Infiniband specific RPMs available at
http://people.redhat.com/dledford/Infiniband

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Raid-10 mount at startup always has problem
  2007-10-25  6:12       ` Neil Brown
  2007-10-25  6:51         ` Doug Ledford
@ 2007-10-25 13:33         ` Daniel L. Miller
  2007-10-26  6:12           ` Neil Brown
  2 siblings, 1 reply; 42+ messages in thread
From: Daniel L. Miller @ 2007-10-25 13:33 UTC (permalink / raw)
To: linux-raid

Neil Brown wrote:
> It might be worth finding out where mdadm is being run in the init
> scripts and adding a "-v" flag, redirecting stdout/stderr to some log
> file, e.g.
>    mdadm -As -v > /var/log/mdadm-$$ 2>&1
>
> And see if that leaves something useful in the log file.
>
I haven't rebooted yet, but here's my /etc/udev/rules.d/70-mdadm.rules
file (BTW - running on Ubuntu 7.10 Gutsy):

SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", RUN+="watershed -i udev-mdadm /sbin/mdadm -As -v > /var/log/mdadm-$$ 2>&1"

# This next line (only) is put into the initramfs,
# where we run a strange script to activate only some of the arrays
# as configured, instead of mdadm -As:
#initramfs# SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", RUN+="watershed -i udev-mdadm /scripts/local-top/mdadm from-udev"

Could that initramfs line be causing the problem?
--
Daniel

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: Raid-10 mount at startup always has problem
  2007-10-25 13:33         ` Daniel L. Miller
@ 2007-10-26  6:12           ` Neil Brown
  0 siblings, 0 replies; 42+ messages in thread
From: Neil Brown @ 2007-10-26 6:12 UTC (permalink / raw)
To: Daniel L. Miller; +Cc: linux-raid

On Thursday October 25, dmiller@amfes.com wrote:
> Neil Brown wrote:
> > It might be worth finding out where mdadm is being run in the init
> > scripts and adding a "-v" flag, redirecting stdout/stderr to some log
> > file, e.g.
> >    mdadm -As -v > /var/log/mdadm-$$ 2>&1
> >
> > And see if that leaves something useful in the log file.
> >
> I haven't rebooted yet, but here's my /etc/udev/rules.d/70-mdadm.rules
> file (BTW - running on Ubuntu 7.10 Gutsy):
>
> SUBSYSTEM=="block", ACTION=="add|change",
> ENV{ID_FS_TYPE}=="linux_raid*", RUN+="watershed -i udev-mdadm
> /sbin/mdadm -As -v > /var/log/mdadm-$$ 2>&1"

Yes, that would do exactly what you are experiencing.

Every time a component of a raid array is discovered, that rule tries to
assemble all known arrays.  So one drive appears, and it tries to
assemble the array, but there aren't enough members, so it gives up.
Then two drives - chances are there still aren't enough, so it gives up
again.  Then when there are three drives it will successfully assemble
the array - degraded.  Then when the fourth drive appears, it is too
late.

I cannot see why that would lead to the "couldn't update array info"
error, but it certainly explains the rest.

That is really bad stuff to have in udev.  The "--incremental" mode was
written precisely for use in udev.  I wonder why they didn't use it....

Maybe you should log a bug report with Ubuntu and suggest they discuss
their udev scripts with the developer of mdadm (that would be me, I
guess).

NeilBrown

^ permalink raw reply [flat|nested] 42+ messages in thread
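For reference, a rule along the lines Neil suggests might look roughly
like this - a sketch only, not Ubuntu's packaged rule, so check mdadm's
documentation for --incremental/-I before relying on it:

   # hand each newly-appeared component to mdadm one at a time; -I slots
   # it into the right array and starts the array only once enough
   # members are present
   SUBSYSTEM=="block", ACTION=="add", ENV{ID_FS_TYPE}=="linux_raid*", RUN+="/sbin/mdadm --incremental $env{DEVNAME}"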
* Re: Raid-10 mount at startup always has problem
  2007-10-25  6:12       ` Neil Brown
  ` (1 preceding sibling ...)
  2007-10-25 13:33         ` Daniel L. Miller
@ 2007-10-25 14:46       ` Bill Davidsen
  2007-10-25 16:13         ` Daniel L. Miller
  2007-10-26  5:59         ` Neil Brown
  2 siblings, 2 replies; 42+ messages in thread
From: Bill Davidsen @ 2007-10-25 14:46 UTC (permalink / raw)
To: Neil Brown; +Cc: Daniel L. Miller, linux-raid

Neil Brown wrote:
> On Wednesday October 24, dmiller@amfes.com wrote:
>
>> Current mdadm.conf:
>> DEVICE partitions
>> ARRAY /dev/.static/dev/md0 level=raid10 num-devices=4
>> UUID=9d94b17b:f5fac31a:577c252b:0d4c4b2a auto=part
>>
>> still have the problem where on boot one drive is not part of the
>> array.  Is there a log file I can check to find out WHY a drive is not
>> being added?  It's been a while since the reboot, but I did find some
>> entries in dmesg - I'm appending both the md lines and the physical disk
>> related lines.  The bottom shows one disk not being added (this time it
>> was sda) - and the disk that gets skipped on each boot seems to be
>> random - there's no consistent failure:
>>
>
> Odd.... but interesting.
> Does it sometimes fail to start the array altogether?
>
>> md: md0 stopped.
>> md: md0 stopped.
>> md: bind<sdc>
>> md: bind<sdd>
>> md: bind<sdb>
>> md: md0: raid array is not clean -- starting background reconstruction
>> raid10: raid set md0 active with 3 out of 4 devices
>> md: couldn't update array info. -22
>>
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> This is the most surprising line, and hence the one most likely to
> convey helpful information.
>
> This message is generated when a process calls "SET_ARRAY_INFO" on an
> array that is already running, and the changes implied by the new
> "array_info" are not supportable.
>
> The only way I can see this happening is if two copies of "mdadm" are
> running at exactly the same time and both are trying to assemble the
> same array.  The first calls SET_ARRAY_INFO and assembles the
> (partial) array.  The second calls SET_ARRAY_INFO and gets this error.
> Not all devices are included because while one mdadm went to look at a
> device, the other had it locked, so the first just ignored it.
>
> I just tried that, and sometimes it worked, but sometimes it assembled
> with 3 out of 4 devices.  I didn't get the "couldn't update array info"
> message, but that doesn't prove I'm wrong.
>
> I cannot imagine how that might be happening (two at once) unless
> maybe 'udev' had been configured to do something as soon as devices
> were discovered.... seems unlikely.
>
> It might be worth finding out where mdadm is being run in the init
> scripts and adding a "-v" flag, redirecting stdout/stderr to some log
> file, e.g.
>    mdadm -As -v > /var/log/mdadm-$$ 2>&1
>
> And see if that leaves something useful in the log file.
>
> BTW, I don't think your problem has anything to do with the fact that
> you are using whole drives.
>
You don't think the "unknown partition table" on sdd is related?  Because
I read that as a sure indication that the system isn't considering the
drive as one without a partition table, and therefore isn't looking for
the superblock on the whole device.  And as Doug pointed out, once you
decide that there is a partition table, lots of things might try to use
it.

> While it is debatable whether that is a good idea or not (I like the
> idea, but Doug doesn't and I respect his opinion) I doubt it would
> contribute to the current problem.
>
> Your description makes me nearly certain that there is some sort of
> race going on (that is the easiest way to explain randomly differing
> behaviours).  The race is probably between different code 'locking'
> (opening with O_EXCL) the various devices.  Given the above error
> message, two different 'mdadm's seems most likely, but an mdadm and a
> mount-by-label scan could probably do it too.
>

--
bill davidsen <davidsen@tmr.com>
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979

^ permalink raw reply [flat|nested] 42+ messages in thread
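A generic way to check what is actually on such a disk (sdd here is just
the example device from the thread; any member would do):

   # dump the tail of the first sector; a real DOS partition table would
   # end with the 55 aa boot signature
   dd if=/dev/sdd bs=512 count=1 2>/dev/null | hexdump -C | tail -3
   # ask mdadm whether the whole device carries a raid superblock
   mdadm --examine /dev/sdd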
* Re: Raid-10 mount at startup always has problem
  2007-10-25 14:46       ` Bill Davidsen
@ 2007-10-25 16:13         ` Daniel L. Miller
  2007-10-26  5:59         ` Neil Brown
  1 sibling, 0 replies; 42+ messages in thread
From: Daniel L. Miller @ 2007-10-25 16:13 UTC (permalink / raw)
To: linux-raid

Bill Davidsen wrote:
> You don't think the "unknown partition table" on sdd is related?
> Because I read that as a sure indication that the system isn't
> considering the drive as one without a partition table, and therefore
> isn't looking for the superblock on the whole device.  And as Doug
> pointed out, once you decide that there is a partition table, lots of
> things might try to use it.

Now, would the drive "letters" (sd[a-d]) change from reboot to reboot?
Because it's not consistent - so far I've seen each of the four drives
fail during boot at one time or another.

I've added the verbose logging to the udev mdadm rule, and I've also
manually specified the drives in mdadm.conf instead of leaving it on
auto.  Curious what the next boot will bring.
--
Daniel

^ permalink raw reply [flat|nested] 42+ messages in thread
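Whichever letters the drives come up with, the superblocks identify the
members, so a loop like this (a sketch; the grep pattern matches mdadm's
0.90 --examine output) shows which physical drive currently holds which
slot, independent of naming:

   for d in /dev/sd[abcd]; do
       echo "== $d"
       mdadm --examine $d | grep -E 'UUID|this'
   done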
* Re: Raid-10 mount at startup always has problem
  2007-10-25 14:46       ` Bill Davidsen
  2007-10-25 16:13         ` Daniel L. Miller
@ 2007-10-26  5:59         ` Neil Brown
  1 sibling, 0 replies; 42+ messages in thread
From: Neil Brown @ 2007-10-26 5:59 UTC (permalink / raw)
To: Bill Davidsen; +Cc: Daniel L. Miller, linux-raid

On Thursday October 25, davidsen@tmr.com wrote:
> Neil Brown wrote:
> >
> > BTW, I don't think your problem has anything to do with the fact that
> > you are using whole drives.
> >
> You don't think the "unknown partition table" on sdd is related?  Because
> I read that as a sure indication that the system isn't considering the
> drive as one without a partition table, and therefore isn't looking for
> the superblock on the whole device.  And as Doug pointed out, once you
> decide that there is a partition table, lots of things might try to use
> it.

"unknown partition table" is what I would expect when using a whole
drive.  It just means "the first block doesn't look like a partition
table", and if you have some early block of an ext3 (or other)
filesystem in the first block (as you would in this case), you wouldn't
expect it to look like a partition table.

I don't understand what you are trying to say with your second
sentence.

NeilBrown

^ permalink raw reply [flat|nested] 42+ messages in thread
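The start of the disk is free for filesystem data because the 0.90
superblock lives near the end of the device - 64KB-aligned, within the
last 128KB.  A sketch for finding it (sector arithmetic per the 0.90
format; on x86 the magic a92b4efc shows up byte-swapped as fc 4e 2b a9):

   sz=$(blockdev --getsz /dev/sdd)   # device size in 512-byte sectors
   off=$(( (sz & ~127) - 128 ))      # round down to 64KB, step back 64KB
   dd if=/dev/sdd skip=$off count=8 2>/dev/null | hexdump -C | head -2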
Thread overview: 42+ messages
2007-08-27 18:14 Raid-10 mount at startup always has problem Daniel L. Miller
[not found] ` <46D49F1A.7030409@tmr.com>
2007-09-10 1:53 ` Daniel L. Miller
2007-09-10 2:04 ` Richard Scobie
[not found] ` <46E4A5F0.9090407@sauce.co.nz>
2007-09-10 2:11 ` Daniel L. Miller
2007-10-24 14:22 ` Daniel L. Miller
2007-10-24 16:25 ` Doug Ledford
2007-10-24 20:01 ` Bill Davidsen
2007-10-25 5:43 ` Daniel L. Miller
2007-10-25 6:40 ` Doug Ledford
2007-10-26 9:15 ` Luca Berra
2007-10-26 16:53 ` Gabor Gombas
2007-10-27 7:57 ` Luca Berra
2007-10-26 19:26 ` Doug Ledford
2007-10-27 7:50 ` Luca Berra
2007-10-27 15:07 ` Gabor Gombas
2007-10-27 20:47 ` Doug Ledford
2007-10-28 13:37 ` Luca Berra
2007-10-28 17:55 ` Doug Ledford
2007-10-29 0:21 ` Bill Davidsen
2007-10-29 7:41 ` Luca Berra
2007-10-29 13:22 ` Bill Davidsen
2007-10-29 15:21 ` Doug Ledford
2007-10-29 15:54 ` Gabor Gombas
2007-10-29 14:31 ` Doug Ledford
2007-10-29 5:59 ` Daniel L. Miller
2007-10-29 8:18 ` Luca Berra
2007-10-29 15:47 ` Doug Ledford
2007-10-29 21:29 ` Luca Berra
2007-10-29 23:15 ` Doug Ledford
2007-10-30 0:03 ` Daniel L. Miller
2007-11-01 13:56 ` Bill Davidsen
2007-12-17 14:58 ` Daniel L. Miller
2007-10-29 17:08 ` Doug Ledford
2007-10-29 18:56 ` Richard Scobie
2007-10-25 6:12 ` Neil Brown
2007-10-25 6:51 ` Doug Ledford
2007-10-25 13:33 ` Daniel L. Miller
2007-10-26 6:12 ` Neil Brown
2007-10-25 14:46 ` Bill Davidsen
2007-10-25 16:13 ` Daniel L. Miller
2007-10-26 5:59 ` Neil Brown