From: Sam Lang <sam.lang@inktank.com>
To: Nathan Howell <nathan.d.howell@gmail.com>
Cc: ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: Cephfs losing files and corrupting others
Date: Thu, 01 Nov 2012 17:32:00 -0500
Message-ID: <5092F860.7040708@inktank.com>
In-Reply-To: <CAD84eiF1dcE5buvTyFctXTSvB4bgEYXtpxwX7yJWiOEASmNvJA@mail.gmail.com>

On Thu 01 Nov 2012 11:22:59 AM CDT, Nathan Howell wrote:
> We have a small (3 node) Ceph cluster that occasionally has issues. It
> loses files and directories, truncates them or fills the contents with
> NULL bytes. So far we haven't been able to build a repro case but it
> seems to happen when bulk loading data into the cluster, a process
> that is run each evening by a cron job. We've gone about a month
> without any issues but had it happen again yesterday during a larger
> bulk load.  The data is backed up outside of ceph and can be reloaded
> but finding the corrupt files takes quite a while.
>
> Has anyone heard of similar issues before? Should I try upgrading to
> 0.48.2 or a newer kernel?

Hi Nathan,

Do the writes succeed?  I.e. the programs creating the files don't get 
errors back?  Are you seeing any problems with the ceph mds or osd 
processes crashing?  Can you describe your I/O workload during these 
bulk loads?  How many files, how much data, multiple clients writing, 
etc.

As far as I know, there haven't been any fixes in 0.48.2 that would
resolve problems like yours.  You might try the ceph-fuse client to see
if you get the same behavior.  If not, then at least we have narrowed
down the problem to the ceph kernel client.
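
Since finding the damaged files is the slow part, and the data is backed
up outside of ceph, one way to speed that up is to walk the backup and
compare each file against the copy on the ceph mount.  The following is
only a rough sketch, not a tested tool: the function names and the
"reason" strings are made up for illustration, and it assumes the backup
tree and the ceph tree share the same relative layout.

```python
import hashlib
import os

CHUNK = 1 << 20  # read files in 1 MiB blocks


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            h.update(block)
    return h.hexdigest()


def is_all_nul(path):
    """Return True if the file is non-empty and contains only NUL bytes."""
    seen_data = False
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            seen_data = True
            if block.count(0) != len(block):
                return False
    return seen_data


def find_damaged(backup_root, ceph_root):
    """Yield (relative_path, reason) for files that differ from the backup.

    Hypothetical helper: assumes both trees use the same relative paths.
    """
    for dirpath, _dirs, files in os.walk(backup_root):
        for name in sorted(files):
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, backup_root)
            dst = os.path.join(ceph_root, rel)
            if not os.path.isfile(dst):
                yield rel, "missing"
            elif os.path.getsize(dst) != os.path.getsize(src):
                yield rel, "size mismatch"
            elif is_all_nul(dst) and not is_all_nul(src):
                yield rel, "all NUL bytes"
            elif file_digest(dst) != file_digest(src):
                yield rel, "content mismatch"
```

Looping over find_damaged("/path/to/backup", "/ceph") and printing each
pair would give a list of files to reload; the backup path here is a
placeholder.  The size check runs before the checksum so truncated files
are reported without reading them in full.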

Thanks,
-sam

>
> ceph version 0.48.1argonaut (commit:a7ad701b9bd479f20429f19e6fea7373ca6bba7c)
> Linux _ 3.4.4-gentoo #2 SMP Sun Jul 1 18:28:16 UTC 2012 x86_64
> Intel(R) Xeon(R) CPU E31240 @ 3.30GHz GenuineIntel GNU/Linux
>
> I'm using the kernel provided cephfs, mounted with these options:
> 10.0.2.2:6789:/ on /ceph type ceph (rw,noatime,nodiratime)
>
> thanks,
> -n
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


