public inbox for linux-xfs@vger.kernel.org
From: Ric Wheeler <ricwheeler@gmail.com>
To: Guido Winkelmann <guido@ambient-entertainment.de>, xfs@oss.sgi.com
Subject: Re: Files not touched in weeks got truncated after a crash
Date: Thu, 14 Nov 2013 05:39:35 +0900
Message-ID: <5283E387.70704@gmail.com>
In-Reply-To: <2662179.4mj0dgORXu@r008>

You should update your kernel. This sounds like an issue that Dave fixed quite
a few months back. The fix has shipped in RHEL and other distros, though I don't
know when CentOS will pick it up.

Ric


On 11/14/2013 01:36 AM, Guido Winkelmann wrote:
> Hi,
>
> We are having some trouble with one of our fileservers using XFS (on Linux).
> Yesterday, one of the external RAIDs on the server failed. Of course, it is
> unavoidable that some data would get lost from the fileserver in such an
> event; however, we lost a lot more files than would seem reasonable. In
> particular, we lost a number of files that had not been written to (but had
> been read from, in some cases) in several weeks.
>
> The data loss manifested itself through files being truncated to length 0 or
> to some other size short of what they should be. (We happen to have an
> external database that keeps track of that.)
>
> The fileserver is based on CentOS 6.3 with kernel version
> 2.6.32-279.9.1.el6.x86_64. It has got several external RAIDs in the 100 TB
> range, connected via FibreChannel.
>
> In case it matters: The server's primary role is as a samba server servicing a
> large number of Windows XP and Windows 7 machines.
>
> We had already been trying to reduce the possible impact of a hardware failure
> by setting a few tunables in /etc/sysctl.conf to try and make the kernel not
> keep dirty buffers around too long:
>
> vm.dirty_background_bytes = 536870912
> vm.dirty_bytes = 134217728
> vm.dirty_writeback_centisecs = 500
> vm.dirty_expire_centisecs = 3000
>
> and by issuing a sync from cron every 15 minutes:
>
> 0,15,30,45 * * * * /bin/sync
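
For readers who want to see what the quoted tunables amount to in everyday units, here is a minimal sketch that parses the four settings exactly as quoted and prints byte values in MiB and centisecond values in seconds. The helper names `parse_sysctl` and `describe` are hypothetical, chosen only for this illustration; the script does not read or change anything on a live system.

```python
# Settings copied verbatim from the quoted message; nothing is read from /proc.
SYSCTL_TEXT = """\
vm.dirty_background_bytes = 536870912
vm.dirty_bytes = 134217728
vm.dirty_writeback_centisecs = 500
vm.dirty_expire_centisecs = 3000
"""


def parse_sysctl(text):
    """Parse 'key = value' lines into a dict of ints (hypothetical helper)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        settings[key.strip()] = int(value.strip())
    return settings


def describe(settings):
    """Render byte values as MiB and centisecond values as seconds."""
    lines = []
    for key, value in settings.items():
        if key.endswith("_bytes"):
            lines.append(f"{key}: {value // (1024 * 1024)} MiB")
        elif key.endswith("_centisecs"):
            lines.append(f"{key}: {value / 100:g} s")
        else:
            lines.append(f"{key}: {value}")
    return lines


settings = parse_sysctl(SYSCTL_TEXT)
for line in describe(settings):
    print(line)
```

Run as is, this reports 512 MiB for vm.dirty_background_bytes, 128 MiB for vm.dirty_bytes, and intervals of 5 s and 30 s. Note that the background threshold as quoted is four times larger than vm.dirty_bytes.
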
>
> Unfortunately, I seem to be unable so far to reproduce the issue on a smaller
> system - and I cannot exactly just walk up to the in-production fileserver and
> rip out yet another array just to see what happens...
>
> This leaves me with a few questions:
>
> Why did we lose so much data through the crash?
>
> Why did even a sync every 15 minutes not prevent further damage?
>
> What can we do to prevent this from happening again in the future?
>
> Regards,
>
> 	Guido
>
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs


Thread overview: 6+ messages
2013-11-13 16:36 Files not touched in weeks got truncated after a crash Guido Winkelmann
2013-11-13 16:51 ` Roger Willcocks
2013-11-14 10:25   ` Guido Winkelmann
2013-11-13 20:39 ` Ric Wheeler [this message]
2013-11-13 21:43   ` Stefan Ring
2013-11-14 10:11     ` Guido Winkelmann
