* XFS Resiliency to the disk errors.
From: Zak, Semion @ 2007-04-05 8:08 UTC (permalink / raw)
To: xfs
Hi,
We are evaluating whether XFS can be used with cheap (not very reliable)
disks, so we have some questions:
What does XFS do to survive disk errors (bad sectors)?
I know that the superblock is duplicated in every AG. What else?
How does XFS behave when disk errors occur (panic/no mount/partial
data access)?
What can be done to recover?
Would zeroing the bad sector, or dumping to another device, reformatting,
and restoring, help?
Thanks,
Semion.

* Re: XFS Resiliency to the disk errors.
From: Eric Sandeen @ 2007-04-05 16:06 UTC
To: Zak, Semion; +Cc: xfs

Zak, Semion wrote:
> Hi,
>
> We are studying possibility to use XFS with cheap (not too reliable)
> discs, so we have some questions:
>
> What in XFS is done to survive the disk errors (bad sectors)?
> I know about superblock duplication in every AG. What else?
>
> What is XFS behavior in case of the disk errors (panic/no mount/partial
> data access)?

Generally metadata IO errors or bad magic found in metadata will shut
down the filesystem gracefully if it can. IO errors on data will just
be IO errors.

> What could be done to restore?

xfsdump/xfsrestore I suppose

> If zero bad sector/dump to other device/format/restore will help?

Well, you can't make data out of nothing. You could dd off the junk
drive, zeroing out unreadable sectors, point xfs_repair at it and hope
for the best. Which, depending on the problem, could wind up not being
very good.

If you want to know how to recover from disaster, it sounds like perhaps
your data is important enough that you should not plan for failure, but
rather find a way to avoid it?

Seems to me the only way I'd want to put drives which are expected to
fail regularly into a product is if the recovery method of "replace the
disk and re-image the appliance" was acceptable, but that's just me. :)

-Eric
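
[Editor's illustration] The "dd off the junk drive, zeroing out unreadable sectors" step above can be sketched as follows. This is only a minimal sketch in Python, not something from the thread: the function name rescue_copy, the 4096-byte block size, and the paths in the usage note are assumptions made for illustration. It roughly mirrors what `dd conv=noerror,sync` does, namely copying block by block and writing zeros wherever a read fails.

```python
import os


def rescue_copy(src_path, dst_path, block_size=4096):
    """Copy src to dst block by block, writing zeros for unreadable blocks.

    Returns the byte offsets of the blocks that could not be read.
    The name and the default block size are illustrative assumptions.
    """
    bad_offsets = []
    src = os.open(src_path, os.O_RDONLY)
    try:
        # Seeking to the end gives the size of a regular file or,
        # on Linux, of a block device.
        size = os.lseek(src, 0, os.SEEK_END)
        with open(dst_path, "wb") as dst:
            offset = 0
            while offset < size:
                want = min(block_size, size - offset)
                try:
                    data = os.pread(src, want, offset)
                except OSError:
                    # Unreadable block: substitute zeros and remember where.
                    data = b"\0" * want
                    bad_offsets.append(offset)
                if not data:
                    break  # unexpected end of input
                dst.write(data)
                offset += len(data)
            # Make sure the image itself reaches the destination disk.
            dst.flush()
            os.fsync(dst.fileno())
    finally:
        os.close(src)
    return bad_offsets
```

A hypothetical use would be rescue_copy("/dev/sdX", "/mnt/good/image.img"), followed by running xfs_repair against the image (xfs_repair -f operates on a regular file, and -n reports problems without modifying anything). As Eric notes, the zeroed blocks are simply gone, so the result depends on what they used to hold.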

* RE: XFS Resiliency to the disk errors.
From: Zak, Semion @ 2007-04-10 6:49 UTC
To: Eric Sandeen; +Cc: xfs

Thank you very much.

I have another question, about data loss on a crash or power cut. Is it
possible to keep it no worse than on other file systems by opening the
important file with the O_SYNC flag, or by using the fsync and sync
functions?

Thanks,
Semion.
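
[Editor's illustration] For readers unfamiliar with the calls named in this question, here is a minimal sketch of what "open with O_SYNC" and "use fsync" look like in practice, written in Python on top of the POSIX calls. The function name durable_write, the ".tmp" suffix, and the write-then-rename pattern are assumptions chosen for illustration; the thread itself does not answer the question, and this sketch makes no claim about how XFS compares with other file systems.

```python
import os


def durable_write(path, data, use_o_sync=False):
    """Write data to path and ask the kernel to push it to stable storage.

    Hypothetical helper: writes a temporary file, fsyncs it, renames it
    over the target, then fsyncs the containing directory.
    """
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = path + ".tmp"  # illustrative temp-file naming
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if use_o_sync:
        # With O_SYNC each write() returns only after the data has been
        # handed to the device, at a cost in throughput.
        flags |= os.O_SYNC
    fd = os.open(tmp_path, flags, 0o644)
    try:
        view = memoryview(data)
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
        # fsync flushes this file's data and metadata.
        os.fsync(fd)
    finally:
        os.close(fd)
    # Atomically replace the old version with the new one.
    os.replace(tmp_path, path)
    # fsync the directory so the rename itself is persisted.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Whether data flushed this way survives a power cut also depends on the drive honouring cache flushes, which is a separate concern with cheap disks.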

* Re: XFS Resiliency to the disk errors.
From: Peter Grandi @ 2007-04-06 18:49 UTC
To: Linux XFS

>>> On Thu, 5 Apr 2007 11:08:07 +0300, "Zak, Semion"
>>> <SZak@nds.com> said:

SZak> Hi, We are studying possibility to use XFS with cheap (not
SZak> too reliable) discs, so we have some questions:

Astute move :-). I hope that you are also thinking of using
16-wide RAID5 too :-).

SZak> What in XFS is done to survive the disk errors (bad
SZak> sectors)? [ ... ]

My impression is that the XFS design is really meant for highly
scalable performance on enterprise level hardware, where the
block device layer abstracts away all drive error issues,
including having UPSes.

Sure you can use it otherwise, but it has a very different
optimal usage envelope from 'ext3' or ReiserFS/Reiser4 (which
have been designed with stronger resiliency and recoverability
features, as they are more oriented to desktop and cheap server
usage).

Anyhow, a highly reliable block device layer can surely be built
out of cheap disks, if one does it right, and people like EMC2
do it regularly with their midrange products.

It may be interesting for you to have a look at the disk
reliability statistics in some recent papers by some Google and
CMU researchers, discussed here:

http://swik.net/User:dolander/All+Things+Distributed/On+the+Reliability+of+Hard+Disks/

* Re: XFS Resiliency to the disk errors.
From: Martin Steigerwald @ 2007-04-07 20:47 UTC
To: linux-xfs

On Thursday, 05 April 2007, Zak, Semion wrote:
> Hi,
>
> We are studying possibility to use XFS with cheap (not too reliable)
> discs, so we have some questions:

Hi Semion!

I recommend at least monitoring the health status of the drives using
smartmontools, with regular short and long self-tests, or a similar
mechanism. That way you *may* at least be warned *before* a disk fails.

Otherwise I would go for a redundant RAID array, so that at least one
drive in a bunch of drives can fail without data loss.

Regards,
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
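
[Editor's illustration] The smartmontools monitoring suggested above could be scripted roughly as below. This is a hedged sketch, not Martin's setup: the function names are hypothetical, and the parser assumes the health line that smartctl -H prints for ATA drives ("SMART overall-health self-assessment test result: ..."); other drive types report health in a different form, in which case the parser returns None.

```python
import subprocess


def parse_smart_health(smartctl_output):
    """Return True if the health line says PASSED, False if it says
    anything else, and None if no such line is present.

    Assumes the ATA-style output of `smartctl -H`.
    """
    marker = "overall-health self-assessment test result"
    for line in smartctl_output.splitlines():
        if marker in line:
            return line.split(":", 1)[1].strip() == "PASSED"
    return None


def check_drive(device):
    """Run `smartctl -H` on a device (needs smartmontools and, usually,
    root privileges) and interpret the result."""
    result = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True,
        text=True,
    )
    return parse_smart_health(result.stdout)
```

The short and long self-tests themselves can be started with smartctl -t short and smartctl -t long on the device, or scheduled through the smartd daemon that ships with smartmontools. A passing result is no guarantee, which is why the advice above pairs monitoring with redundancy.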