From: Ric Wheeler <rwheeler@redhat.com>
To: ceph-users@lists.ceph.com, Linux fs XFS <xfs@oss.sgi.com>
Subject: Re: [ceph-users] xfs corruption, data disaster!
Date: Mon, 11 May 2015 17:47:59 +0300
Message-ID: <5550C11F.9090807@redhat.com>
In-Reply-To: <loom.20150505T030824-422@post.gmane.org>
On 05/05/2015 04:13 AM, Yujian Peng wrote:
> Emmanuel Florac <eflorac@...> writes:
>
>> On Mon, 4 May 2015 07:00:32 +0000 (UTC)
>> Yujian Peng <pengyujian5201314 <at> 126.com> wrote:
>>
>>> I'm encountering a data disaster. I have a ceph cluster with 145 osd.
>>> The data center had a power problem yesterday, and all of the ceph
>>> nodes were down. But now I find that 6 disks (xfs) in 4 nodes have
>>> data corruption. Some disks are unable to mount, and some disks have
>>> IO errors in syslog. mount: Structure needs cleaning
>>> xfs_log_force: error 5 returned
>>> I tried to repair one with xfs_repair -L /dev/sdx1, but the ceph-osd
>>> reported a leveldb error:
>>> Error initializing leveldb: Corruption: checksum mismatch
>>> I cannot start the 6 osds and 22 pgs are down.
>>> This is really a tragedy for me. Can you give me some idea how to
>>> recover the xfs? Thanks very much!
>> For XFS problems, ask the XFS ML: xfs <at> oss.sgi.com
>>
>> You didn't give enough details, by far. What version of kernel and
>> distro are you running? If there were errors, please post extensive
>> logs. If you have IO errors on some disks, you probably MUST replace
>> them before going any further.
>>
>> Why did you run xfs_repair -L ? Did you try xfs_repair without options
>> first? Were you running the very very latest version of xfs_repair
>> (3.2.2) ?
>>
> The OS is ubuntu 12.04.5 with kernel 3.13.0
> uname -a
> Linux ceph19 3.13.0-32-generic #57~precise1-Ubuntu SMP Tue Jul 15 03:51:20
> UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
> cat /etc/issue
> Ubuntu 12.04.5 LTS \n \l
> xfs_repair -V
> xfs_repair version 3.1.7
> I've tried xfs_repair without options, but it showed me some errors, so I
> used the -L option.
> Thanks for your reply!
>
Responding quickly to a couple of things:
* xfs_repair -L zeroes out the XFS log even when it is dirty, so any metadata
changes still sitting in the log are lost. That is not normally a good thing
to do.
* replacing disks with IO errors is not a great idea if you still need that
data. You might want to copy the data from that disk to a new disk (same or
greater size) and then try to repair that new disk. A lot depends on the type
of IO error you see - you might have cable issues, HBA issues, or fairly normal
read issues (which are not worth replacing a disk for).
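
A rough sketch of that copy-then-repair sequence, assuming GNU dd and
xfs_repair are installed. The function names and the device paths /dev/sdx1
and /dev/sdy1 are placeholders for illustration only. conv=noerror,sync keeps
dd going past read errors and pads unreadable blocks with zeros so offsets
stay aligned; xfs_repair -n only reports what it would change.

```sh
#!/bin/sh
# copy_disk SRC DST: raw-copy SRC to DST, continuing past read errors.
# Unreadable blocks (and a short final block) are zero-padded to 4096 bytes.
copy_disk() {
    dd if="$1" of="$2" bs=4096 conv=noerror,sync
}

# check_copy DEV: dry-run repair on the copy; -n makes no modifications.
check_copy() {
    xfs_repair -n "$1"
}

# Example (hypothetical device names; double-check them before running):
#   copy_disk /dev/sdx1 /dev/sdy1
#   check_copy /dev/sdy1
```

A dedicated rescue tool such as GNU ddrescue retries bad areas more carefully
than plain dd, so it may be the better choice on a badly damaged disk.
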
You should work with your vendor's support team if you have a support contract,
or post to the XFS devel list (copied above) for help.
Good luck!
Ric