* vol_id and RAID1 members
@ 2006-01-24 22:30 Marco d'Itri
2006-01-24 23:55 ` Kay Sievers
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: Marco d'Itri @ 2006-01-24 22:30 UTC (permalink / raw)
To: linux-hotplug
Today one of the disks in a mirrored RAID array failed, and I noticed
that the /dev/disk/by-label/ link had changed from md2 to hda6, the
failed device.
By looking at /dev/.udev/db/ I determined that vol_id returned
ID_FS_LABEL_SAFE=home for both md2 and hda6, because hda6 is not part of
the RAID array anymore and obviously both devices have identical file
system superblocks.
Having the by-label link change after a reboot is a big problem which
could easily cause data loss, but I am not sure about how to fix it.
Should vol_id be modified to look for the RAID superblock and if one is
present ignore the file system superblock?
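
The collision described above can be illustrated with a small sketch. It assumes the per-device properties have already been collected into a dictionary; the function name `find_label_collisions` and the `snapshot` data are hypothetical, and only the `ID_FS_LABEL_SAFE` key and the device names md2 and hda6 come from the report.

```python
from collections import defaultdict


def find_label_collisions(props_by_device):
    """Return {label: [devices]} for labels reported by more than one device."""
    by_label = defaultdict(list)
    for device, props in sorted(props_by_device.items()):
        label = props.get("ID_FS_LABEL_SAFE")
        if label:
            by_label[label].append(device)
    return {label: devs for label, devs in by_label.items() if len(devs) > 1}


# Hypothetical snapshot mirroring the report: the array and its dropped
# member both carry the same file system label.
snapshot = {
    "md2": {"ID_FS_LABEL_SAFE": "home"},
    "hda6": {"ID_FS_LABEL_SAFE": "home"},
    "hda1": {"ID_FS_LABEL_SAFE": "boot"},
}
collisions = find_label_collisions(snapshot)
print(collisions)
```

Any label that appears in the result can only be backed by one by-label link, so which device the link points to depends on which device was handled last.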
--
ciao,
Marco
_______________________________________________
Linux-hotplug-devel mailing list http://linux-hotplug.sourceforge.net
Linux-hotplug-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-hotplug-devel
* Re: vol_id and RAID1 members
From: Kay Sievers @ 2006-01-24 23:55 UTC (permalink / raw)
To: linux-hotplug
On Tue, Jan 24, 2006 at 11:30:36PM +0100, Marco d'Itri wrote:
> Today one of the disks in a mirrored RAID array failed, and I noticed
> that the /dev/disk/by-label/ link had changed from md2 to hda6, the
> failed device.
> By looking at /dev/.udev/db/ I determined that vol_id returned
> ID_FS_LABEL_SAFE=home for both md2 and hda6, because hda6 is not part of
> the RAID array anymore and obviously both devices have identical file
> system superblocks.
>
> Having the by-label link change after a reboot is a big problem which
> could easily cause data loss, but I am not sure about how to fix it.
> Should vol_id be modified to look for the RAID superblock and if one is
> present ignore the file system superblock?
Sure, vol_id does look at the raid signature and ignores it, that's the
reason your raid did not show up in the past. Does the degradation of the
device mean that the raid signature is no longer valid?
What does the debug output of vol_id print if run on the raid member? If it
does not show anything interesting, compile udev with DEBUG=true and run
vol_id with UDEV_LOG=7.
Kay
* Re: vol_id and RAID1 members
From: Marco d'Itri @ 2006-01-27 13:06 UTC (permalink / raw)
To: linux-hotplug
On Jan 25, Kay Sievers <kay.sievers@vrfy.org> wrote:
> Sure, vol_id does look at the raid signature and ignores it, that's the
> reason your raid did not show up in the past. Does the degradation of the
> device mean that the raid signature is no longer valid?
Yes. Indeed, vol_id gets EIO when trying to read the superblock.
I think that in this case it should report the error and exit because,
as I showed, if the partition really is an array member then vol_id will
report wrong information which, if used, will cause data loss.
vol_id[8597]: main: BLKGETSIZE64=1999839744
vol_id[8597]: volume_id_probe_linux_raid: probing at offset 0x0, size 0x77332200
vol_id[8597]: volume_id_get_buffer: get buffer off 0x77320000(1999765504), len 0x800
vol_id[8597]: volume_id_get_buffer: read seekbuf off:0x77320000 len:0x800
vol_id[8597]: volume_id_get_buffer: read failed (Input/output error)
vol_id[8597]: volume_id_probe_intel_software_raid: probing at offset 0x0, size 0x77332200
vol_id[8597]: volume_id_get_buffer: get buffer off 0x77331e00(1999838720), len 0x200
vol_id[8597]: volume_id_get_buffer: read seekbuf off:0x77331e00 len:0x200
vol_id[8597]: volume_id_get_buffer: got 0x200 (512) bytes
[...]
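
The offsets in the log above can be checked with a little arithmetic. This is a minimal sketch, assuming the 0.90 md superblock convention (the device size rounded down to a 64 KiB boundary, minus another 64 KiB); the constant and function names are illustrative. The second offset is only compared against what the log shows (size minus 0x400) and is not derived from any format definition.

```python
MD_RESERVED_BYTES = 64 * 1024  # assumption: reserve used by the 0.90 md superblock


def md090_superblock_offset(size_bytes):
    """Round the size down to a 64 KiB boundary, then step back 64 KiB."""
    return (size_bytes & ~(MD_RESERVED_BYTES - 1)) - MD_RESERVED_BYTES


size = 1999839744  # BLKGETSIZE64 value from the log
md_off = md090_superblock_offset(size)
second_off = size - 0x400  # offset of the second probe, as observed in the log

print(hex(size), hex(md_off), md_off)
print(hex(second_off), second_off)
```

Under that assumption the first result is 0x77320000 (1999765504), which is the read that fails with "Input/output error", while the read at 0x77331e00 (1999838720) succeeds.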
--
ciao,
Marco
* Re: vol_id and RAID1 members
From: Kay Sievers @ 2006-01-27 16:02 UTC (permalink / raw)
To: linux-hotplug
On Fri, Jan 27, 2006 at 02:06:15PM +0100, Marco d'Itri wrote:
> On Jan 25, Kay Sievers <kay.sievers@vrfy.org> wrote:
>
> > Sure, vol_id does look at the raid signature and ignores it, that's the
> > reason your raid did not show up in the past. Does the degradation of the
> > device mean that the raid signature is no longer valid?
> Yes. Indeed, vol_id gets EIO when trying to read the superblock.
>
> I think that in this case it should report the error and exit, because
> as I showed if the partition really is an array member then it will
> report wrong information which if used will cause data loss.
But how can an error returned when reading the very end of the device
be an indication of a raid device? If we can't find a raid signature,
we should look for a filesystem. I've seen some devices where the
reported size is not fully readable; they would fail with such logic,
which would break other things.
Kay
* Re: vol_id and RAID1 members
From: Marco d'Itri @ 2006-01-27 16:35 UTC (permalink / raw)
To: linux-hotplug
On Jan 27, Kay Sievers <kay.sievers@vrfy.org> wrote:
> > Yes. Indeed, vol_id gets EIO when trying to read the superblock.
> >
> > I think that in this case it should report the error and exit, because
> > as I showed if the partition really is an array member then it will
> > report wrong information which if used will cause data loss.
>
> But how can returning an error by reading the very end of the device
> be an indication for a raid device? If we can't find a raid signature,
What I meant is that failure to read the partition should be an
indication of a broken device.
> we should look for a filesystem. I've seen some devices, where the
> reported size is not fully readable and they would fail with such a logic,
> which would break other things.
These devices look broken or at best misconfigured.
In which sane and normal scenario would a partition have sectors which
return EIO when read, but be otherwise fully functional?
--
ciao,
Marco
* Re: vol_id and RAID1 members
From: Kay Sievers @ 2006-01-27 16:48 UTC (permalink / raw)
To: linux-hotplug
On Fri, Jan 27, 2006 at 05:35:46PM +0100, Marco d'Itri wrote:
> On Jan 27, Kay Sievers <kay.sievers@vrfy.org> wrote:
>
> > > Yes. Indeed, vol_id gets EIO when trying to read the superblock.
> > >
> > > I think that in this case it should report the error and exit, because
> > > as I showed if the partition really is an array member then it will
> > > report wrong information which if used will cause data loss.
> >
> > But how can returning an error by reading the very end of the device
> > be an indication for a raid device? If we can't find a raid signature,
> What I meant is that failure to read the partition should be an
> indication for broken devices.
By "failure to read the partition" you mean not being able to
read the last few sectors? Yeah, that's broken in a way, but it
happens, and such a device is perfectly usable.
> > we should look for a filesystem. I've seen some devices, where the
> > reported size is not fully readable and they would fail with such a logic,
> > which would break other things.
> These devices look broken or at best misconfigured.
> In which sane and normal scenario would a partition have sectors which
> return EIO when read, but be otherwise fully functional?
Oh, that's hardware magic: some devices just don't report the right
size, and you need to do a binary search to determine the _real_ size.
We can't just ignore devices which report an incorrect size, or depend on
the reported size being completely correct in all cases.
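
The binary search mentioned above could look roughly like this. It is a sketch under a strong assumption: every sector below the real size reads fine and every sector at or above it fails, so readability is monotonic. The names `find_real_size` and `is_readable` are hypothetical, and a real implementation would issue actual reads instead of calling a predicate.

```python
def find_real_size(reported_sectors, is_readable):
    """Return the number of readable sectors on a device that may
    over-report its size.

    Assumes readability is monotonic: sectors below the real size are
    readable, sectors at or above it are not.
    """
    lo, hi = 0, reported_sectors  # sectors < lo are readable, sectors >= hi are not
    while lo < hi:
        mid = (lo + hi) // 2
        if is_readable(mid):
            lo = mid + 1  # the real size lies beyond mid
        else:
            hi = mid  # mid is already past the real size
    return lo


# Hypothetical device that reports 1000 sectors but can only read 992.
reported, real = 1000, 992
found = find_real_size(reported, lambda sector: sector < real)
print(found)
```

A dying disk with unreadable sectors scattered through the device breaks the monotonic assumption, so a search like this cannot tell a mis-sized device from a failing one.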
The real failure to look for is why your raid member no longer has a
signature at all, right?
Kay
* Re: vol_id and RAID1 members
From: Marco d'Itri @ 2006-01-27 16:56 UTC (permalink / raw)
To: linux-hotplug
On Jan 27, Kay Sievers <kay.sievers@vrfy.org> wrote:
> The real failure to look for is why your raid member no longer has a
> signature at all, right?
Because the disk is dying, and these sectors are unreadable.
It's a corner case, but I'd rather see vol_id break with those other
broken devices than work with this broken one and risk mounting it
instead of the RAID array.
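
The two positions in this thread can be contrasted in a short sketch. It is not vol_id's code: `probe`, `ProbeError`, the `strict` flag and the callables are hypothetical, and the returned keys are only illustrative. The point is that a read error in the RAID superblock area is a third outcome, distinct from "signature found" and "signature absent".

```python
import errno


class ProbeError(Exception):
    """Raised when the RAID signature area cannot be read at all."""


def probe(read_raid_area, fs_label, strict):
    """Hypothetical probe. read_raid_area() returns True if a RAID signature
    is present, False if it is absent, and raises OSError on a read failure."""
    try:
        if read_raid_area():
            return {"ID_FS_USAGE": "raid"}
    except OSError as exc:
        if strict:
            # Proposed behaviour: report the error and stop.
            raise ProbeError("cannot read RAID superblock area") from exc
        # Lenient behaviour: treat an unreadable area like a missing signature.
    return {"ID_FS_USAGE": "filesystem", "ID_FS_LABEL_SAFE": fs_label}


def dying_member():
    # Simulates the failed read seen in the log.
    raise OSError(errno.EIO, "Input/output error")


lenient = probe(dying_member, "home", strict=False)
try:
    probe(dying_member, "home", strict=True)
    strict_failed = False
except ProbeError:
    strict_failed = True
healthy = probe(lambda: True, "home", strict=False)
print(lenient, strict_failed, healthy)
```

With the lenient policy the dying member reports the label "home" and competes with the array for the by-label link; with the strict policy it reports nothing, at the cost of also rejecting devices that merely over-report their size.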
--
ciao,
Marco