From: Stan Hoeppner <stan@hardwarefreak.com>
To: xfs@oss.sgi.com
Subject: Re: creating a new 80 TB XFS
Date: Sat, 25 Feb 2012 20:57:05 -0600
Message-ID: <4F499F81.7080305@hardwarefreak.com>
In-Reply-To: <20297.22833.759182.360340@tree.ty.sabi.co.UK>
On 2/25/2012 3:57 PM, Peter Grandi wrote:
>> There are always failures. But again, this is a backup system.
>
> Sure, but the last thing you want is for your backup system to
> fail.
Putting an exclamation point on Peter's wisdom requires nothing more
than browsing the list archive:
Subject: xfs_repair of critical volume
Date: Sun, 31 Oct 2010 00:54:13 -0700
To: xfs@oss.sgi.com
I have a large XFS filesystem (60 TB) that is composed of 5 hardware
RAID 6 volumes. One of those volumes had several drives fail in a very
short time and we lost that volume. However, four of the volumes seem
OK. We are in a worse state because our backup unit failed a week later
when four drives simultaneously went offline. So we are in a very bad state.
[...]
This saga is available in these two XFS list threads:
http://oss.sgi.com/archives/xfs/2010-07/msg00077.html
http://oss.sgi.com/archives/xfs/2010-10/msg00373.html
Lessons:
1. Don't use cheap hardware for a backup server.
2. Make sure your backup system is reliable.
3. Do test restore operations regularly.
I suggest you get the dual active/active controller configuration and
use two PCIe SAS HBAs, one connected to each controller, and use SCSI
multipath. That way a failed HBA doesn't leave you dead in the water
until a replacement arrives. How long would a replacement take, and at
what cost to operations, if your single HBA failed during a critical restore?
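For example, a minimal sketch of the multipath side (device names are
placeholders, and real configs usually add vendor/product specific
device sections -- check your controller's documentation):

  # /etc/multipath.conf -- minimal illustrative sketch
  defaults {
      user_friendly_names yes
  }

  service multipathd start
  multipath -ll    # each LUN should show one active path per HBA

If multipath -ll shows only one path per LUN, one of the HBA-to-
controller links isn't being used and you've lost the redundancy.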
Get the battery-backed cache option, and verify the controllers disable
the individual drive write caches.
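As a quick sanity check (sketch only -- /dev/sdX is a placeholder, and
drives sitting behind a hardware RAID controller usually have to be
queried through the vendor's CLI rather than sdparm/hdparm):

  sdparm --get=WCE /dev/sdX   # SAS/SCSI: WCE = 0 means write cache off
  hdparm -W /dev/sdX          # SATA: want "write-caching = 0 (off)"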
Others have recommended stitching two small arrays together with mdadm
and using a single XFS on the resulting volume instead of one big array
and one XFS. I suggest instead using two XFS filesystems, one on each
small array. This ensures you can still access some of your backups in
the event of a problem with one array or one filesystem.
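A sketch of that layout -- device names, mount points, stripe unit and
stripe width are placeholders; use your controller's actual RAID6
geometry when setting su/sw:

  # one filesystem per RAID6 array, mounted separately
  mkfs.xfs -d su=256k,sw=10 /dev/sdb   # e.g. 12-drive RAID6 = 10 data drives
  mkfs.xfs -d su=256k,sw=10 /dev/sdc
  mount /dev/sdb /backup1
  mount /dev/sdc /backup2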
As others mentioned, an xfs_check or xfs_repair can take many hours or
even days on a multi-terabyte filesystem with a huge amount of metadata.
If you need to do a restore during that window you're out of luck. With
two filesystems, and with critical images/files duplicated on each,
you're still in business.
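If one of them does get damaged, a read-only pass first (sketch;
/dev/sdb is a placeholder, and the filesystem must be unmounted) shows
the scope of the damage before you commit to the full repair:

  umount /backup1
  xfs_repair -n /dev/sdb   # -n = no-modify, just report problems
  xfs_repair /dev/sdb      # the real repair; this is the hours-to-days step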
--
Stan