* Cephfs losing files and corrupting others
@ 2012-11-01 16:22 Nathan Howell
  2012-11-01 22:32 ` Sam Lang
  0 siblings, 1 reply; 10+ messages in thread
From: Nathan Howell @ 2012-11-01 16:22 UTC
  To: ceph-devel

We have a small (3 node) Ceph cluster that occasionally has issues. It
loses files and directories, truncates them or fills the contents with
NULL bytes. So far we haven't been able to build a repro case but it
seems to happen when bulk loading data into the cluster, a process
that is run each evening by a cron job. We've gone about a month
without any issues but had it happen again yesterday during a larger
bulk load. The data is backed up outside of ceph and can be reloaded
but finding the corrupt files takes quite a while.

Has anyone heard of similar issues before? Should I try upgrading to
0.48.2 or a newer kernel?

ceph version 0.48.1argonaut (commit:a7ad701b9bd479f20429f19e6fea7373ca6bba7c)
Linux _ 3.4.4-gentoo #2 SMP Sun Jul 1 18:28:16 UTC 2012 x86_64
Intel(R) Xeon(R) CPU E31240 @ 3.30GHz GenuineIntel GNU/Linux

I'm using the kernel provided cephfs, mounted with these options:
10.0.2.2:6789:/ on /ceph type ceph (rw,noatime,nodiratime)

thanks,
-n
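Since the report above notes that finding the corrupt files takes quite a while, one of the symptoms (contents replaced entirely by NULL bytes) can be searched for directly. The following is only a minimal sketch, not something taken from the thread: the function names `is_all_null` and `find_null_files` are hypothetical, and it assumes the filesystem is mounted locally and that an empty file should not count as corrupt.

```python
import os

CHUNK_SIZE = 1 << 20  # read 1 MiB at a time so large files are not loaded whole


def is_all_null(path, chunk_size=CHUNK_SIZE):
    """Return True if the file is non-empty and every byte in it is NUL."""
    saw_data = False
    with open(path, "rb") as f:
        while True:
            block = f.read(chunk_size)
            if not block:
                break
            saw_data = True
            # bytes.count(0) counts NUL bytes; any other byte ends the check early
            if block.count(0) != len(block):
                return False
    return saw_data


def find_null_files(root):
    """Yield regular files under root whose contents are only NUL bytes."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and special files
            try:
                if is_all_null(path):
                    yield path
            except OSError:
                continue  # unreadable file: skip it rather than abort the scan
```

Calling `list(find_null_files("/ceph"))` would walk the mount point shown in the report. This only catches the NULL-filled case; missing or truncated files need a comparison against a known-good record.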
* Re: Cephfs losing files and corrupting others
  2012-11-01 16:22 Cephfs losing files and corrupting others Nathan Howell
@ 2012-11-01 22:32 ` Sam Lang
  2012-11-01 23:02   ` Gregory Farnum
  2012-11-01 23:30   ` Nathan Howell
  0 siblings, 2 replies; 10+ messages in thread
From: Sam Lang @ 2012-11-01 22:32 UTC
  To: Nathan Howell; +Cc: ceph-devel

On Thu 01 Nov 2012 11:22:59 AM CDT, Nathan Howell wrote:
> We have a small (3 node) Ceph cluster that occasionally has issues. It
> loses files and directories, truncates them or fills the contents with
> NULL bytes. So far we haven't been able to build a repro case but it
> seems to happen when bulk loading data into the cluster, a process
> that is run each evening by a cron job. We've gone about a month
> without any issues but had it happen again yesterday during a larger
> bulk load. The data is backed up outside of ceph and can be reloaded
> but finding the corrupt files takes quite a while.
>
> Has anyone heard of similar issues before? Should I try upgrading to
> 0.48.2 or a newer kernel?

Hi Nathan,

Do the writes succeed? I.e. the programs creating the files don't get
errors back? Are you seeing any problems with the ceph mds or osd
processes crashing? Can you describe your I/O workload during these
bulk loads? How many files, how much data, multiple clients writing, etc.

As far as I know, there haven't been any fixes to 0.48.2 to resolve
problems like yours. You might try the ceph fuse client to see if you
get the same behavior. If not, then at least we have narrowed down the
problem to the ceph kernel client.

Thanks,
-sam

>
> ceph version 0.48.1argonaut (commit:a7ad701b9bd479f20429f19e6fea7373ca6bba7c)
> Linux _ 3.4.4-gentoo #2 SMP Sun Jul 1 18:28:16 UTC 2012 x86_64
> Intel(R) Xeon(R) CPU E31240 @ 3.30GHz GenuineIntel GNU/Linux
>
> I'm using the kernel provided cephfs, mounted with these options:
> 10.0.2.2:6789:/ on /ceph type ceph (rw,noatime,nodiratime)
>
> thanks,
> -n
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: Cephfs losing files and corrupting others
  2012-11-01 22:32 ` Sam Lang
@ 2012-11-01 23:02   ` Gregory Farnum
  2012-11-01 23:30   ` Nathan Howell
  1 sibling, 0 replies; 10+ messages in thread
From: Gregory Farnum @ 2012-11-01 23:02 UTC
  To: Nathan Howell, Sam Lang; +Cc: ceph-devel

On Thu, Nov 1, 2012 at 11:32 PM, Sam Lang <sam.lang@inktank.com> wrote:
> On Thu 01 Nov 2012 11:22:59 AM CDT, Nathan Howell wrote:
>>
>> We have a small (3 node) Ceph cluster that occasionally has issues. It
>> loses files and directories, truncates them or fills the contents with
>> NULL bytes. So far we haven't been able to build a repro case but it
>> seems to happen when bulk loading data into the cluster, a process
>> that is run each evening by a cron job. We've gone about a month
>> without any issues but had it happen again yesterday during a larger
>> bulk load. The data is backed up outside of ceph and can be reloaded
>> but finding the corrupt files takes quite a while.
>>
>> Has anyone heard of similar issues before? Should I try upgrading to
>> 0.48.2 or a newer kernel?
>
>
> Hi Nathan,
>
> Do the writes succeed? I.e. the programs creating the files don't get
> errors back? Are you seeing any problems with the ceph mds or osd processes
> crashing? Can you describe your I/O workload during these bulk loads? How
> many files, how much data, multiple clients writing, etc.
>
> As far as I know, there haven't been any fixes to 0.48.2 to resolve problems
> like yours. You might try the ceph fuse client to see if you get the same
> behavior. If not, then at least we have narrowed down the problem to the
> ceph kernel client.

Are you using hard links, by any chance?
Do you have one or many MDS systems?
What filesystem are you using on your OSDs?
-Greg
* Re: Cephfs losing files and corrupting others
  2012-11-01 22:32 ` Sam Lang
  2012-11-01 23:02   ` Gregory Farnum
@ 2012-11-01 23:30   ` Nathan Howell
  2012-11-02  2:37     ` Yan, Zheng
  2012-11-03 16:54     ` Gregory Farnum
  1 sibling, 2 replies; 10+ messages in thread
From: Nathan Howell @ 2012-11-01 23:30 UTC
  To: Sam Lang, Gregory Farnum; +Cc: ceph-devel

On Thu, Nov 1, 2012 at 3:32 PM, Sam Lang <sam.lang@inktank.com> wrote:
> Do the writes succeed? I.e. the programs creating the files don't get
> errors back? Are you seeing any problems with the ceph mds or osd processes
> crashing? Can you describe your I/O workload during these bulk loads? How
> many files, how much data, multiple clients writing, etc.
>
> As far as I know, there haven't been any fixes to 0.48.2 to resolve problems
> like yours. You might try the ceph fuse client to see if you get the same
> behavior. If not, then at least we have narrowed down the problem to the
> ceph kernel client.

Yes, the writes succeed. Wednesday's failure looked like this:

1) rsync 100-200mb tarball directly into ceph from a remote site
2) untar ~500 files from tarball in ceph into a new directory in ceph
3) wait for a while
4) the .tar file and some log files disappeared but the untarred files were fine

Total filesystem size is:

pgmap v2221244: 960 pgs: 960 active+clean; 2418 GB data, 7293 GB used,
6151 GB / 13972 GB avail

Generally our load looks like:

Constant trickle of 1-2mb files from 3 machines, about 1GB per day
total. No file is written to by more than 1 machine, but the files go
into shared directories.

Grid jobs are running constantly and are doing sequential reads from
the filesystem. Compute nodes have the filesystem mounted read-only.
They're primarily located at a remote site (~40ms away) and tend to
average 1-2 megabits/sec.

Nightly data jobs load in ~10GB from a few remote sites into <10
large files. These are split up into about 1000 smaller files but the
originals are also kept. All of this is done on one machine. The
journals and osd drives are write saturated while this is going on.

On Thu, Nov 1, 2012 at 4:02 PM, Gregory Farnum <greg@inktank.com> wrote:
> Are you using hard links, by any chance?

No, we are using a handful of soft links though.

> Do you have one or many MDS systems?

ceph mds stat says: e686: 1/1/1 up {0=xxx=up:active}, 2 up:standby

> What filesystem are you using on your OSDs?

btrfs

thanks,
-n
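The failure sequence described above (load data, wait, then find files missing or altered) suggests recording what was written right after each bulk load and re-checking it later. The sketch below is only an illustration under assumptions: `build_manifest` and `verify_manifest` are hypothetical helper names, SHA-256 is an arbitrary choice of checksum, and the manifest would need to be stored somewhere outside the filesystem being checked.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large tarballs are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its size and SHA-256."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            manifest[rel] = {
                "size": os.path.getsize(path),
                "sha256": sha256_of(path),
            }
    return manifest


def verify_manifest(root, manifest):
    """Return (relative_path, problem) pairs for files that no longer match."""
    problems = []
    for rel, expected in sorted(manifest.items()):
        path = os.path.join(root, rel)
        if not os.path.isfile(path):
            problems.append((rel, "missing"))          # file disappeared
        elif os.path.getsize(path) != expected["size"]:
            problems.append((rel, "size changed"))     # e.g. truncation
        elif sha256_of(path) != expected["sha256"]:
            problems.append((rel, "content changed"))  # e.g. NULL-filled
    return problems
```

Running `build_manifest` at the end of the nightly job and `verify_manifest` some hours later would distinguish the three symptoms reported in this thread. Verifying from a second client as well as the writing client may help tell client-side caching effects apart from problems in the stored data.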
* Re: Cephfs losing files and corrupting others
  2012-11-01 23:30   ` Nathan Howell
@ 2012-11-02  2:37     ` Yan, Zheng
  2012-11-03 16:54     ` Gregory Farnum
  1 sibling, 0 replies; 10+ messages in thread
From: Yan, Zheng @ 2012-11-02 2:37 UTC
  To: Nathan Howell; +Cc: Sam Lang, Gregory Farnum, ceph-devel

On Fri, Nov 2, 2012 at 7:30 AM, Nathan Howell <nathan.d.howell@gmail.com> wrote:
> On Thu, Nov 1, 2012 at 3:32 PM, Sam Lang <sam.lang@inktank.com> wrote:
>> Do the writes succeed? I.e. the programs creating the files don't get
>> errors back? Are you seeing any problems with the ceph mds or osd processes
>> crashing? Can you describe your I/O workload during these bulk loads? How
>> many files, how much data, multiple clients writing, etc.
>>
>> As far as I know, there haven't been any fixes to 0.48.2 to resolve problems
>> like yours. You might try the ceph fuse client to see if you get the same
>> behavior. If not, then at least we have narrowed down the problem to the
>> ceph kernel client.
>
> Yes, the writes succeed. Wednesday's failure looked like this:
>
> 1) rsync 100-200mb tarball directly into ceph from a remote site
> 2) untar ~500 files from tarball in ceph into a new directory in ceph
> 3) wait for a while
> 4) the .tar file and some log files disappeared but the untarred files were fine
>
> Total filesystem size is:
>
> pgmap v2221244: 960 pgs: 960 active+clean; 2418 GB data, 7293 GB used,
> 6151 GB / 13972 GB avail
>
> Generally our load looks like:
>
> Constant trickle of 1-2mb files from 3 machines, about 1GB per day
> total. No file is written to by more than 1 machine, but the files go
> into shared directories.
>
> Grid jobs are running constantly and are doing sequential reads from
> the filesystem. Compute nodes have the filesystem mounted read-only.
> They're primarily located at a remote site (~40ms away) and tend to
> average 1-2 megabits/sec.
>
> Nightly data jobs load in ~10GB from a few remote sites into <10
> large files. These are split up into about 1000 smaller files but the
> originals are also kept. All of this is done on one machine. The
> journals and osd drives are write saturated while this is going on.
>
>
> On Thu, Nov 1, 2012 at 4:02 PM, Gregory Farnum <greg@inktank.com> wrote:
>> Are you using hard links, by any chance?
>
> No, we are using a handful of soft links though.
>
>
>> Do you have one or many MDS systems?
>
> ceph mds stat says: e686: 1/1/1 up {0=xxx=up:active}, 2 up:standby
>
>
>> What filesystem are you using on your OSDs?
>
> btrfs
>
>

My recent patch "ceph: Fix i_size update race" can probably fix the
truncated file issue.

Yan, Zheng
* Re: Cephfs losing files and corrupting others
  2012-11-01 23:30   ` Nathan Howell
  2012-11-02  2:37     ` Yan, Zheng
@ 2012-11-03 16:54     ` Gregory Farnum
       [not found]       ` <CAD84eiEDMXiXf8aFojpAFJPt=5DVZNFbnNq9BnJBxMzRrdNjrw@mail.gmail.com>
  1 sibling, 1 reply; 10+ messages in thread
From: Gregory Farnum @ 2012-11-03 16:54 UTC
  To: Nathan Howell, Samuel Just; +Cc: Sam Lang, ceph-devel

On Fri, Nov 2, 2012 at 12:30 AM, Nathan Howell <nathan.d.howell@gmail.com> wrote:
> On Thu, Nov 1, 2012 at 3:32 PM, Sam Lang <sam.lang@inktank.com> wrote:
>> Do the writes succeed? I.e. the programs creating the files don't get
>> errors back? Are you seeing any problems with the ceph mds or osd processes
>> crashing? Can you describe your I/O workload during these bulk loads? How
>> many files, how much data, multiple clients writing, etc.
>>
>> As far as I know, there haven't been any fixes to 0.48.2 to resolve problems
>> like yours. You might try the ceph fuse client to see if you get the same
>> behavior. If not, then at least we have narrowed down the problem to the
>> ceph kernel client.
>
> Yes, the writes succeed. Wednesday's failure looked like this:
>
> 1) rsync 100-200mb tarball directly into ceph from a remote site
> 2) untar ~500 files from tarball in ceph into a new directory in ceph
> 3) wait for a while
> 4) the .tar file and some log files disappeared but the untarred files were fine

Just to be clear, you copied a tarball into Ceph and untarred all in
Ceph, and the extracted contents were fine but the tarball
disappeared? So this looks like a case of successfully-written files
disappearing?
Did you at any point check the tarball from a machine other than the
initial client that copied it in?

This truncation sounds like maybe Yan's fix will deal with it. But if
you've also seen files with the proper size that are empty or corrupted,
that sounds like an OSD bug. Sam, are you aware of any btrfs issues
that could cause this?

Nathan, you've also seen parts of the filesystem hierarchy get lost?
That's rather more concerning; under what circumstances have you seen
that?
-Greg

> Total filesystem size is:
>
> pgmap v2221244: 960 pgs: 960 active+clean; 2418 GB data, 7293 GB used,
> 6151 GB / 13972 GB avail
>
> Generally our load looks like:
>
> Constant trickle of 1-2mb files from 3 machines, about 1GB per day
> total. No file is written to by more than 1 machine, but the files go
> into shared directories.
>
> Grid jobs are running constantly and are doing sequential reads from
> the filesystem. Compute nodes have the filesystem mounted read-only.
> They're primarily located at a remote site (~40ms away) and tend to
> average 1-2 megabits/sec.
>
> Nightly data jobs load in ~10GB from a few remote sites into <10
> large files. These are split up into about 1000 smaller files but the
> originals are also kept. All of this is done on one machine. The
> journals and osd drives are write saturated while this is going on.
>
>
> On Thu, Nov 1, 2012 at 4:02 PM, Gregory Farnum <greg@inktank.com> wrote:
>> Are you using hard links, by any chance?
>
> No, we are using a handful of soft links though.
>
>
>> Do you have one or many MDS systems?
>
> ceph mds stat says: e686: 1/1/1 up {0=xxx=up:active}, 2 up:standby
>
>
>> What filesystem are you using on your OSDs?
>
> btrfs
>
>
> thanks,
> -n
[parent not found: <CAD84eiEDMXiXf8aFojpAFJPt=5DVZNFbnNq9BnJBxMzRrdNjrw@mail.gmail.com>]
* Re: Cephfs losing files and corrupting others
       [not found]       ` <CAD84eiEDMXiXf8aFojpAFJPt=5DVZNFbnNq9BnJBxMzRrdNjrw@mail.gmail.com>
@ 2012-11-23  7:37         ` Nathan Howell
  2012-11-25 20:45           ` Nathan Howell
  0 siblings, 1 reply; 10+ messages in thread
From: Nathan Howell @ 2012-11-23 7:37 UTC
  To: Gregory Farnum; +Cc: Samuel Just, Sam Lang, ceph-devel

I upgraded to 0.54 and now there are some hints in the logs. The
directories referenced in the log entries are now missing:

2012-11-23 07:28:04.802864 mds.0 [ERR] loaded dup inode 1000000662f [2,head] v3851654 at /xxx/20120203, but inode 1000000662f.head v3853093 already exists at ~mds0/stray7/1000000662f
2012-11-23 07:28:04.802889 mds.0 [ERR] loaded dup inode 10000003a4b [2,head] v431518 at /xxx/20120206, but inode 10000003a4b.head v3853192 already exists at ~mds0/stray8/10000003a4b
2012-11-23 07:28:04.802909 mds.0 [ERR] loaded dup inode 1000000149e [2,head] v431522 at /xxx/20120207, but inode 1000000149e.head v3853206 already exists at ~mds0/stray8/1000000149e
2012-11-23 07:28:04.802927 mds.0 [ERR] loaded dup inode 10000000a5f [2,head] v431526 at /xxx/20120208, but inode 10000000a5f.head v3853208 already exists at ~mds0/stray8/10000000a5f

Any ideas?

On Thu, Nov 15, 2012 at 11:00 AM, Nathan Howell <nathan.d.howell@gmail.com> wrote:
> Yes, successfully written files were disappearing. We switched to ceph-fuse
> and haven't seen any files truncated since. Older files (written months ago)
> are still having their entire contents replaced with NULL bytes, seemingly at
> random. I can't yet say for sure this has happened since switching over to
> fuse... but we think it has.
>
> I'm going to test all of the archives over the next few days and restore
> them from S3, so we should be back in a known-good state after that. In the
> event more files end up corrupted, is there any logging that I can enable
> that would help track down the problem?
>
> thanks,
> -n
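To list which directories are affected, the `loaded dup inode` entries quoted in the message above can be pulled out of the MDS log mechanically. This is a rough sketch that assumes each entry sits on a single line in exactly the format shown in this thread; the pattern and the `parse_dup_inodes` name are illustrative and not part of Ceph, and other Ceph versions may word the message differently.

```python
import re

# Pattern derived only from the four log entries quoted in this thread.
DUP_INODE = re.compile(
    r"loaded dup inode (?P<ino>[0-9a-f]+) \[[^\]]*\] v(?P<version>\d+) "
    r"at (?P<path>.+?), but inode (?P=ino)\.\S+ v(?P<existing_version>\d+) "
    r"already exists at (?P<existing_path>\S+)"
)


def parse_dup_inodes(lines):
    """Return one dict per 'loaded dup inode' entry found in the given lines."""
    entries = []
    for line in lines:
        match = DUP_INODE.search(line)
        if match is None:
            continue  # unrelated log line
        entries.append({
            "ino": match.group("ino"),
            "version": int(match.group("version")),
            "path": match.group("path"),
            "existing_version": int(match.group("existing_version")),
            "existing_path": match.group("existing_path"),
        })
    return entries
```

Feeding it an open log file, for example `parse_dup_inodes(open(log_path))` with `log_path` pointing at wherever the MDS log lives on a given system, gives a list whose `path` fields are the directories the MDS reported, which can then be compared with what is actually visible on the mount.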
* Re: Cephfs losing files and corrupting others
  2012-11-23  7:37         ` Nathan Howell
@ 2012-11-25 20:45           ` Nathan Howell
  2012-12-04 21:57             ` Gregory Farnum
  0 siblings, 1 reply; 10+ messages in thread
From: Nathan Howell @ 2012-11-25 20:45 UTC
  To: Gregory Farnum, ceph-devel; +Cc: Samuel Just, Sam Lang

So when trawling through the filesystem doing checksum validation
these popped up on the files that are filled with null bytes:
https://gist.github.com/186ad4c5df816d44f909

Is there any way to fsck today? Looks like feature #86
http://tracker.newdream.net/issues/86 isn't implemented yet.

thanks,
-n

On Thu, Nov 22, 2012 at 11:37 PM, Nathan Howell <nathan.d.howell@gmail.com> wrote:
> I upgraded to 0.54 and now there are some hints in the logs. The
> directories referenced in the log entries are now missing:
>
> 2012-11-23 07:28:04.802864 mds.0 [ERR] loaded dup inode 1000000662f
> [2,head] v3851654 at /xxx/20120203, but inode 1000000662f.head
> v3853093 already exists at ~mds0/stray7/1000000662f
> 2012-11-23 07:28:04.802889 mds.0 [ERR] loaded dup inode 10000003a4b
> [2,head] v431518 at /xxx/20120206, but inode 10000003a4b.head v3853192
> already exists at ~mds0/stray8/10000003a4b
> 2012-11-23 07:28:04.802909 mds.0 [ERR] loaded dup inode 1000000149e
> [2,head] v431522 at /xxx/20120207, but inode 1000000149e.head v3853206
> already exists at ~mds0/stray8/1000000149e
> 2012-11-23 07:28:04.802927 mds.0 [ERR] loaded dup inode 10000000a5f
> [2,head] v431526 at /xxx/20120208, but inode 10000000a5f.head v3853208
> already exists at ~mds0/stray8/10000000a5f
>
> Any ideas?
* Re: Cephfs losing files and corrupting others
  2012-11-25 20:45           ` Nathan Howell
@ 2012-12-04 21:57             ` Gregory Farnum
  2012-12-05  1:23               ` Gregory Farnum
  0 siblings, 1 reply; 10+ messages in thread
From: Gregory Farnum @ 2012-12-04 21:57 UTC
  To: Nathan Howell; +Cc: ceph-devel@vger.kernel.org, Samuel Just, Sam Lang

On Sun, Nov 25, 2012 at 12:45 PM, Nathan Howell <nathan.d.howell@gmail.com> wrote:
> So when trawling through the filesystem doing checksum validation
> these popped up on the files that are filled with null bytes:
> https://gist.github.com/186ad4c5df816d44f909
>
> Is there any way to fsck today? Looks like feature #86
> http://tracker.newdream.net/issues/86 isn't implemented yet.

Yeah, unfortunately there isn't; fsck is one of those things that we
want to do as we prepare CephFS for production use, but we're only now
starting to move back in that direction.

The error printouts you're seeing indicate that...actually, I don't
know what they mean in this context. Hrm. In any case, Zheng Yan
contributed some patches that could impact a number of these issues,
but I still don't see how the NULL bytes could enter into it from our
end.

If you can afford the disk space required to turn on "debug osd = 10"
on the OSDs, and "debug mds = 10" on the MDS, that might give us a
clue about what's going on, if we manage to grab the logs that overlap
with the bad event (or at least the detection of it). You'll certainly
want to enable log rotation, though, since that will generate some
very large logs.

Sorry for the slow turnaround time on this, our attention is being
pulled in a lot of directions besides CephFS and this is going to be a
hard one.
-Greg
* Re: Cephfs losing files and corrupting others
  2012-12-04 21:57             ` Gregory Farnum
@ 2012-12-05  1:23               ` Gregory Farnum
  0 siblings, 0 replies; 10+ messages in thread
From: Gregory Farnum @ 2012-12-05 1:23 UTC
  To: Nathan Howell; +Cc: ceph-devel@vger.kernel.org, Samuel Just, Sam Lang

On Tue, Dec 4, 2012 at 1:57 PM, Gregory Farnum <greg@inktank.com> wrote:
> On Sun, Nov 25, 2012 at 12:45 PM, Nathan Howell
> <nathan.d.howell@gmail.com> wrote:
>> So when trawling through the filesystem doing checksum validation
>> these popped up on the files that are filled with null bytes:
>> https://gist.github.com/186ad4c5df816d44f909
>>
>> Is there any way to fsck today? Looks like feature #86
>> http://tracker.newdream.net/issues/86 isn't implemented yet.
>
> Yeah, unfortunately there isn't; fsck is one of those things that we
> want to do as we prepare CephFS for production use, but we're only now
> starting to move back in that direction.
>
> The error printouts you're seeing indicate that...actually, I don't
> know what they mean in this context. Hrm. In any case, Zheng Yan
> contributed some patches that could impact a number of these issues,
> but I still don't see how the NULL bytes could enter into it from our
> end.

Oooh, actually, Zheng's patches are definitely related to this issue.
If you can try the "next" branch, that might resolve it going forward
(it won't repair current damage, though).
-Greg
Thread overview: 10+ messages
2012-11-01 16:22 Cephfs losing files and corrupting others Nathan Howell
2012-11-01 22:32 ` Sam Lang
2012-11-01 23:02 ` Gregory Farnum
2012-11-01 23:30 ` Nathan Howell
2012-11-02 2:37 ` Yan, Zheng
2012-11-03 16:54 ` Gregory Farnum
[not found] ` <CAD84eiEDMXiXf8aFojpAFJPt=5DVZNFbnNq9BnJBxMzRrdNjrw@mail.gmail.com>
2012-11-23 7:37 ` Nathan Howell
2012-11-25 20:45 ` Nathan Howell
2012-12-04 21:57 ` Gregory Farnum
2012-12-05 1:23 ` Gregory Farnum