* XFS: how to NOT null files on fsck? @ 2004-07-05 5:47 Norberto Bensa 2004-07-09 16:37 ` L A Walsh 0 siblings, 1 reply; 52+ messages in thread From: Norberto Bensa @ 2004-07-05 5:47 UTC (permalink / raw) To: linux-kernel Hello, how do I setup XFS to not null files after a bad shutdown? Thanks, Norberto ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-05 5:47 XFS: how to NOT null files on fsck? Norberto Bensa @ 2004-07-09 16:37 ` L A Walsh 2004-07-09 21:59 ` Chris Wedgwood 2004-07-29 1:30 ` Nathan Scott 0 siblings, 2 replies; 52+ messages in thread From: L A Walsh @ 2004-07-09 16:37 UTC (permalink / raw) To: Norberto Bensa; +Cc: linux-kernel It's a feature! :-) It's been in the code for years to randomly write nulls to some files that have been modified in the past few days after a bad shutdown. Reported on XFS list and got same overwhelming response there. Apparently not easily reproduced, no one has a clue why it does it. Just does. Even after multiple syncs, files edited within the past few days will sometimes go mysteriously null. Good reason to do daily backups as the backups will usually contain the correct file... Now if we could just come up with a reproducible test case...but when I try to reproduce it, it doesn't. Grrr....it knows when I'm scrutinizing!! :-) -l Norberto Bensa wrote: >Hello, > >how do I setup XFS to not null files after a bad shutdown? > >Thanks, >Norberto ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-09 16:37 ` L A Walsh @ 2004-07-09 21:59 ` Chris Wedgwood 2004-07-10 18:33 ` L A Walsh 2004-07-10 18:43 ` Jan Knutar 2004-07-29 1:30 ` Nathan Scott 1 sibling, 2 replies; 52+ messages in thread From: Chris Wedgwood @ 2004-07-09 21:59 UTC (permalink / raw) To: L A Walsh; +Cc: Norberto Bensa, linux-kernel On Fri, Jul 09, 2004 at 09:37:48AM -0700, L A Walsh wrote: > Even after multiple syncs, files edited within the past few days > will sometimes go mysteriously null. Good reason to do daily > backups as the backups will usually contain the correct file... I *never* see this even when beating the hell out of machines and trying to break things. I do see nulls in cases where the metadata was updated and the data didn't flush, that's supposed to happen. > Now if we could just come up with a reproducible test case...but > when I try to reproduce it, it doesn't. Grrr....it knows when I'm > scrutinizing!! :-) Use anything that handles dotfiles or configuration badly (e.g. KDE), make some changes or just 'run it' for a bit. Every now and then something rewrites some files. Yank the power a few times and sooner or later you'll end up with problems under KDE certainly. Sane applications (MTAs like postfix for example) don't have this problem because they were written with more clue. If they did have this problem people would scream, because mail would get lost... and large mail servers might have tens of thousands of files moving about in-flight, much more strenuous than a few dot-files. --cw ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-09 21:59 ` Chris Wedgwood @ 2004-07-10 18:33 ` L A Walsh 2004-07-10 18:43 ` Chris Wedgwood 2004-07-12 23:03 ` Bernd Eckenfels 2004-07-10 18:43 ` Jan Knutar 1 sibling, 2 replies; 52+ messages in thread From: L A Walsh @ 2004-07-10 18:33 UTC (permalink / raw) To: Chris Wedgwood; +Cc: L A Walsh, Norberto Bensa, linux-kernel My cases have been "vim" edited files. I'd sorta think once vim has exited, the data has been flushed, but that's just a WAG... -l Chris Wedgwood wrote: >On Fri, Jul 09, 2004 at 09:37:48AM -0700, L A Walsh wrote: > >>ven after multiple syncs, files edited within the past few days >>will sometimes go mysteriously null. Good reason to do daily >>backups as the backups will usually contain the correct file... >> >> >I *never* see this even when beating the hell out of machines and >trying to break things. > >I do see nulls in cases where the metadata was updated and the data >didn't flush, that's supposed to happen. > > >>Now if we could just come up with a reproducable test case...but >>when I try to reproduce it, it doesn't. Grrr....it knows when I'm >>scrutinizing!! :-) >> >> >Use anything that handles dotfiles or configuration badly (ie. KDE), >make some changes or just 'run it' for a bit. Every now something >rewrites some files. Yank the power a few times and sooner or later >you'll end up with problems under KDE certainly. > > --- No desktop on this machine...it's a server I log into remotely for the most part. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-10 18:33 ` L A Walsh @ 2004-07-10 18:43 ` Chris Wedgwood 2004-07-10 21:24 ` Bernd Eckenfels 2004-07-12 23:03 ` Bernd Eckenfels 1 sibling, 1 reply; 52+ messages in thread From: Chris Wedgwood @ 2004-07-10 18:43 UTC (permalink / raw) To: L A Walsh; +Cc: Norberto Bensa, linux-kernel On Sat, Jul 10, 2004 at 11:33:09AM -0700, L A Walsh wrote: > My cases have been "vim" edited files. I'd sorta think once vim has > exited, the data has been flushed, but that's just a WAG... No, that's not the case. Normally when files are written the data isn't flushed immediately; it sits in memory (the page-cache) for some (usually) small amount of time. If the data is critical, applications should fsync (or similar) as required. FWIW my standard method of shutdown is: sync ; poweroff -f sorta thing. I don't lose any data doing this (at least nothing I've noticed). --cw ^ permalink raw reply [flat|nested] 52+ messages in thread
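A minimal sketch of the advice in the message above, assuming Python on a POSIX system: data handed to write() only reaches the page cache, so an application that needs a file to survive a power cut has to request the flush itself. The helper name write_durably is hypothetical and not taken from this thread.

```python
import os
import tempfile


def write_durably(path, data):
    """Write bytes to path and return only after fsync() has succeeded."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o666)
    try:
        view = memoryview(data)
        while len(view) > 0:
            written = os.write(fd, view)  # write() may accept fewer bytes
            view = view[written:]
        os.fsync(fd)  # ask the kernel to push the page-cache data to disk
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "test.txt")
        write_durably(target, b"test\ntest\n")
        print(os.path.getsize(target))
```

This only narrows the window. The file is still truncated in place, so a crash between the truncating open and the fsync() can leave a short or empty file.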
* Re: XFS: how to NOT null files on fsck? 2004-07-10 18:43 ` Chris Wedgwood @ 2004-07-10 21:24 ` Bernd Eckenfels 2004-07-11 21:54 ` Helge Hafting 0 siblings, 1 reply; 52+ messages in thread From: Bernd Eckenfels @ 2004-07-10 21:24 UTC (permalink / raw) To: linux-kernel In article <20040710184357.GA5014@taniwha.stupidest.org> you wrote: > No, that's not the case. Normally when files are written the data > isn't flushed immediately; it sits in memory (the page-cache) for > some (usually) small amount of time. Does that mean that closing a tempfile and then renaming the file is not a reliable way to tell that the data is persisted? I usually use an atomic rename to have a point from which on I can tell if the data is complete and persisted. I thought close() has fsync() semantics? Greetings Bernd -- eckes privat - http://www.eckes.org/ Project Freefire - http://www.freefire.org/ ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-10 21:24 ` Bernd Eckenfels @ 2004-07-11 21:54 ` Helge Hafting 2004-07-12 17:56 ` H. Peter Anvin 0 siblings, 1 reply; 52+ messages in thread From: Helge Hafting @ 2004-07-11 21:54 UTC (permalink / raw) To: Bernd Eckenfels; +Cc: linux-kernel On Sat, Jul 10, 2004 at 11:24:53PM +0200, Bernd Eckenfels wrote: > In article <20040710184357.GA5014@taniwha.stupidest.org> you wrote: > > No, that's not the case. Normally when files are written the data > > isn't flushed immediately; it sits in memory (the page-cache) for > > some (usually) small amount of time. > > Does that mean that closing a tempfile and then renaming the file is not > a reliable way to tell that the data is persisted? I usually use an atomic > rename to have a point from which on I can tell if the data is complete > and persisted. > > I thought close() has fsync() semantics? > No, it doesn't. close() will flush the C library buffer. That means the data moves from those buffers to the pagecache. The program crashing after that will have no effect on the file. It can still be lost if the _kernel_ crashes though. If you want the pagecache flushed to disk, use fsync (or sync). Helge Hafting ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-11 21:54 ` Helge Hafting @ 2004-07-12 17:56 ` H. Peter Anvin 2004-07-12 19:59 ` Chris Wedgwood 0 siblings, 1 reply; 52+ messages in thread From: H. Peter Anvin @ 2004-07-12 17:56 UTC (permalink / raw) To: linux-kernel Followup to: <20040711215446.GA21443@hh.idb.hist.no> By author: Helge Hafting <helgehaf@aitel.hist.no> In newsgroup: linux.dev.kernel > > > No, it doesn't. > > close() will flush the C library buffer. That means the data > moves from theose buffers to the pagacache. The program crashing > after that will have no effect on the file. It can still > be lost if the _kernel_ crashes though. > If you want the pagecache flushed to disk, use fsync (or sync) > No it won't, since if you're using file descriptors there *is* no C library buffer. fclose() will, though, and then call close(). -hpa ^ permalink raw reply [flat|nested] 52+ messages in thread
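The distinction drawn above (a buffered stream has a user-space buffer, a bare file descriptor does not) can be sketched as follows. This assumes Python, whose buffered file objects stand in here for C stdio streams; layers_demo is a hypothetical name used only for illustration.

```python
import os
import tempfile


def layers_demo(path):
    """Return the file size the kernel reports at three points in time."""
    sizes = []
    f = open(path, "wb")            # buffered stream, comparable to fopen()
    f.write(b"test\ntest\n")        # stays in the user-space buffer for now
    sizes.append(os.stat(path).st_size)
    f.flush()                       # like fflush(): hands the data to the kernel
    sizes.append(os.stat(path).st_size)
    os.fsync(f.fileno())            # asks the kernel to write the page cache out
    sizes.append(os.stat(path).st_size)
    f.close()
    return sizes


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(layers_demo(os.path.join(tmp, "test.txt")))
```

With CPython's default buffering this should print [0, 10, 10]: the kernel sees nothing until the flush, and the fsync() changes durability, not the visible size. A write through os.write() on a raw descriptor skips the first layer entirely, which is the case described in the message above.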
* Re: XFS: how to NOT null files on fsck? 2004-07-12 17:56 ` H. Peter Anvin @ 2004-07-12 19:59 ` Chris Wedgwood 2004-07-12 20:32 ` H. Peter Anvin 2004-07-12 22:29 ` Bernd Eckenfels 0 siblings, 2 replies; 52+ messages in thread From: Chris Wedgwood @ 2004-07-12 19:59 UTC (permalink / raw) To: H. Peter Anvin; +Cc: linux-kernel On Mon, Jul 12, 2004 at 05:56:11PM +0000, H. Peter Anvin wrote: > No it won't, since if you're using file descriptors there *is* no C > library buffer. fclose() will, though, and then call close(). Data sits in the page-cache though, and if you lose power before that's flushed you will lose data. This is why fsync is needed to be sure. --cw ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-12 19:59 ` Chris Wedgwood @ 2004-07-12 20:32 ` H. Peter Anvin 2004-07-12 22:29 ` Bernd Eckenfels 1 sibling, 0 replies; 52+ messages in thread From: H. Peter Anvin @ 2004-07-12 20:32 UTC (permalink / raw) To: Chris Wedgwood; +Cc: linux-kernel Chris Wedgwood wrote: > On Mon, Jul 12, 2004 at 05:56:11PM +0000, H. Peter Anvin wrote: > > >>No it won't, since if you're using file descriptors there *is* no C >>library buffer. fclose() will, though, and then call close(). > > > Data sits in the page-cache though, and if you loose power before > that's flushed you will loose data. This is why fsync is needed to be > sure. > Correct. -hpa ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-12 19:59 ` Chris Wedgwood 2004-07-12 20:32 ` H. Peter Anvin @ 2004-07-12 22:29 ` Bernd Eckenfels 1 sibling, 0 replies; 52+ messages in thread From: Bernd Eckenfels @ 2004-07-12 22:29 UTC (permalink / raw) To: linux-kernel In article <20040712195956.GA14105@taniwha.stupidest.org> you wrote: > Data sits in the page-cache though, and if you lose power before > that's flushed you will lose data. This is why fsync is needed to be > sure. Yes right, I was confusing that with networked filesystems with commit-on-close semantics. Greetings Bernd BTW: I was stracing java, and it is enough to do "fos.getFD().sync(); fos.close()" on a FileOutputStream to get an fsync(fd) followed by close(fd). -- eckes privat - http://www.eckes.org/ Project Freefire - http://www.freefire.org/ ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-10 18:33 ` L A Walsh 2004-07-10 18:43 ` Chris Wedgwood @ 2004-07-12 23:03 ` Bernd Eckenfels 2004-07-12 23:14 ` Chris Wedgwood 1 sibling, 1 reply; 52+ messages in thread From: Bernd Eckenfels @ 2004-07-12 23:03 UTC (permalink / raw) To: linux-kernel Hello, In article <40F03665.90108@tlinx.org> you wrote: > My cases have been "vim" edited files. I'd sorta think once vim has > exited, the > data has been flushed, but that's just a WAG... just a small background investigation, I checked joe: open("test.txt", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3 write(3, "test\ntest\n", 10) = 10 close(3) = 0 ... which does not fsync... (it is also not an option in the source) and vim: rename("test.txt", "test.txz~") = 0 open("test.txt", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3 write(3, " test\ntest\n", 11) = 11 close(3) = 0 chmod("test.txt", 0100664) = 0 ... does no fsync, either. Greetings Bernd -- eckes privat - http://www.eckes.org/ Project Freefire - http://www.freefire.org/ ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-12 23:03 ` Bernd Eckenfels @ 2004-07-12 23:14 ` Chris Wedgwood 0 siblings, 0 replies; 52+ messages in thread From: Chris Wedgwood @ 2004-07-12 23:14 UTC (permalink / raw) To: Bernd Eckenfels; +Cc: linux-kernel On Tue, Jul 13, 2004 at 01:03:04AM +0200, Bernd Eckenfels wrote: > open("test.txt", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3 old data blocks released (truncated), transaction for this written to journal more-or-less synchronously > write(3, "test\ntest\n", 10) = 10 > close(3) = 0 new data sitting in page-cache, not written to disk (in the case of XFS the new blocks probably aren't even allocated at this stage). the file size being extended is i assume recorded in the journal though. if you crash now, you see nulls or a truncated file, i think this is what people are getting with dotfiles. KDE is especially good at triggering this, it seems --cw ^ permalink raw reply [flat|nested] 52+ messages in thread
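One common way for an application to avoid the truncate-then-write window described above is to write a temporary file, fsync it, and rename it over the original. The sketch below assumes Python on a POSIX system; replace_atomically is a hypothetical helper, and the directory fsync at the end is an extra precaution that this thread does not discuss.

```python
import os
import tempfile


def replace_atomically(path, data):
    """Replace path with data via a temporary file, fsync() and rename()."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target.
    # Note: mkstemp() creates it with mode 0600, which the target inherits.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # data is on disk before the rename
        os.replace(tmp_path, path)   # rename(2) swaps the name atomically
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Assumption: also flush the directory so the new name itself is durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "dotfile")
        replace_atomically(target, b"setting=1\n")
        print(open(target, "rb").read())
```

The intent is that a reader after a crash finds either the complete old file or the complete new one. The old file is never truncated, so there is no moment at which its name refers to blocks that have not been written yet.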
* Re: XFS: how to NOT null files on fsck? 2004-07-09 21:59 ` Chris Wedgwood 2004-07-10 18:33 ` L A Walsh @ 2004-07-10 18:43 ` Jan Knutar 2004-07-10 18:46 ` Chris Wedgwood 1 sibling, 1 reply; 52+ messages in thread From: Jan Knutar @ 2004-07-10 18:43 UTC (permalink / raw) To: Chris Wedgwood; +Cc: L A Walsh, Norberto Bensa, linux-kernel On Saturday 10 July 2004 00:59, Chris Wedgwood wrote: > I *never* see this even when beating the hell out of machines and > trying to break things. I've seen this on a partition with NO other activity than me editing a .c file with emacs in a project consisting of about 4 files in total, compiling and testing occasionally, editing again, etc... Then one day, power loss; when power came back, the file was nothing but null. At least it had correct size and timestamp though, great comfort, that. :) ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-10 18:43 ` Jan Knutar @ 2004-07-10 18:46 ` Chris Wedgwood 2004-07-10 18:55 ` Norberto Bensa 0 siblings, 1 reply; 52+ messages in thread From: Chris Wedgwood @ 2004-07-10 18:46 UTC (permalink / raw) To: Jan Knutar; +Cc: L A Walsh, Norberto Bensa, linux-kernel On Sat, Jul 10, 2004 at 09:43:49PM +0300, Jan Knutar wrote: > I've seen this on a partition with NO other activity than me > editing a .c file with emacs in a project consisting of about 4 > files in total, compiling and testing occasionally, editing again, > etc... Then one day, power loss; when power came back, the file was > nothing but null. At least it had correct size and timestamp though, > great comfort, that. :) This is expected. XFS does not journal data. If you want that then use ext3 or reiserfs. --cw ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-10 18:46 ` Chris Wedgwood @ 2004-07-10 18:55 ` Norberto Bensa 2004-07-10 19:19 ` Chris Wedgwood ` (2 more replies) 0 siblings, 3 replies; 52+ messages in thread From: Norberto Bensa @ 2004-07-10 18:55 UTC (permalink / raw) To: Chris Wedgwood; +Cc: Jan Knutar, L A Walsh, linux-kernel Chris Wedgwood wrote: > XFS does not journal data. I think we all know that. The point, why the hell does it null files? ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-10 18:55 ` Norberto Bensa @ 2004-07-10 19:19 ` Chris Wedgwood 2004-07-12 21:20 ` Chris Wedgwood [not found] ` <2hgxc-5x9-9@gated-at.bofh.it> 2004-07-10 19:33 ` Andreas Schwab 2004-07-11 1:21 ` Gopikrishnan Sidhardhan 2 siblings, 2 replies; 52+ messages in thread From: Chris Wedgwood @ 2004-07-10 19:19 UTC (permalink / raw) To: Norberto Bensa; +Cc: Jan Knutar, L A Walsh, linux-kernel On Sat, Jul 10, 2004 at 03:55:26PM -0300, Norberto Bensa wrote: > I think we all know that. The point, why the hell does it null > files? A decision was made somewhere that this is better than showing potentially bogus or confidential data, so on log-replay some parts of files may be zeroed. I can see arguments for and against this, and clearly for a lot of people the zeroing is a real pain. It would be nice for some people to prevent log-replay zeroing files but then something would have to be able to determine whether or not these blocks were newly allocated (and thus might contain confidential data and need to be zeroed) or previously part of the file in which case we probably would like them left alone. I don't know any of the code well enough to know how easy this is or even if I'm telling the truth :) Hopefully someone who does can speak up on this. --cw ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-10 19:19 ` Chris Wedgwood @ 2004-07-12 21:20 ` Chris Wedgwood 2004-07-12 22:40 ` L A Walsh [not found] ` <2hgxc-5x9-9@gated-at.bofh.it> 1 sibling, 1 reply; 52+ messages in thread From: Chris Wedgwood @ 2004-07-12 21:20 UTC (permalink / raw) To: Norberto Bensa; +Cc: Jan Knutar, L A Walsh, linux-kernel On Sat, Jul 10, 2004 at 12:19:14PM -0700, Chris Wedgwood wrote: > It would be nice for some people to prevent log-replay zeroing files > but then something would have to be able to determine whether or not > these blocks were newly allocated (and this might contain > confidential data and need to be zeroed) or previously part of the > file in which case we probably would like them left alone. I told lies. > I don't know any of the code well enough to know how easy this is or > even if I'm telling the truth :) Hopefully someone who does can > speak up on this. I knew I was completely full of shit. XFS does *not* zero files, it simply returns zeros for unwritten extents. If you open an existing file and scribble all over it, you might see the old data after a crash, or the new data if it was flushed. You shouldn't see zeros though. What does happen though, is that dotfiles are truncated and rewritten; if the data blocks aren't flushed you will get zeros back because the extents were unwritten. This is really the only sensible thing to do given the circumstances. My guess is that with other fs' (when journaling metadata only) the blocks allocated for the newly written data are *usually* the same as the recently freed blocks from the truncate so things appear to work but in reality it's probably mostly luck. XFS could behave the same way, but sooner or later you will still lose when you get crap back instead of old data. Some applications just need to be fixed. --cw ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-12 21:20 ` Chris Wedgwood @ 2004-07-12 22:40 ` L A Walsh 2004-07-12 22:53 ` Chris Wedgwood 0 siblings, 1 reply; 52+ messages in thread From: L A Walsh @ 2004-07-12 22:40 UTC (permalink / raw) To: Chris Wedgwood; +Cc: Norberto Bensa, Jan Knutar, L A Walsh, linux-kernel <aside> Chris, I'd never say you were full of shit or lied in any circumstance. Mistakes are human -- not being "full of shit" or "lying". Get over it. Don't inflate error. Seems to be related to pervasive belief that people have to be either all good or all bad or all perfect, or flawed, or all white hat or black hat, God or imperfect and to be "good" in programming field one must be perfect and never be _known_ to make a mistake (like some who portray others as dangerous or fools because they may hold knowledge of that name-callers' faults). The ones who pose the real danger are those who censor others because the "masses" can't handle the truth. Anyway it's unuseful to demean yourself or others. </aside> If it is of any help (I doubt it, it perplexes me)...the files I've written out with vim and have returned "nulls" have been files that were written out 2-3 _DAYS_ earlier -- often with more recent write having been saved fine. I've also seen sections in log files where blocks would return zero in the middle of a log. Obviously blocks before and after successfully made it to disk, but in _RARE_ circumstances (crashes and unplanned shutdowns are already rare enough, so it's a rare bug that only shows up on a 'rare' occasion...:-). Almost (shot in the dark), like some code that was supposed to zero unused but allocated datablocks got pointed at the wrong blocks, since these files are readable as having been written (yes may all be out of membuffs) and are often recoverable from the day's backup. If it was a file I just edited and then it crashed, that I could understand more than having files I haven't touched for a few days be zapped. 
-l Chris Wedgwood wrote: >On Sat, Jul 10, 2004 at 12:19:14PM -0700, Chris Wedgwood wrote: > > > >>It would be nice for some people to prevent log-replay zeroing files >>but then something would have to be able to determine whether or not >>these blocks were newly allocated (and this might contain >>confidential data and need to be zeroed) or previously part of the >>file in which case we probably would like them left alone. >> >> > >I told lies. > > > >>I don't know any of the code well enough to know how easy this is or >>even if I'm telling the truth :) Hopefully someone who does can >>speak up on this. >> >> > >I knew I was completely full of shit. > > >XFS does *not* zero files, it simply returns zeros for unwritten >extents. If you open an existing file and scribble all over it, you >might see the old data during a crash, or the new data if it was >flushed. You shouldn't see zero's though. > >What does happen though, is that dotfiles are truncated and rewritten, >if the data blocks aren't flushed you will get zeros back because the >extents were unwritten. This is really the only sensible thing to do >given the circumstances. > >My guess is that with other fs' (when journaling metadata only) the >blocks allocated for the newly written data are *usually* the same as >the recently freed blocks from the truncate so things appear to work >but in reality it's probably mostly luck. XFS could behave the same >way, but sooner or later you will still loose when you get crap back >instead of old data. > >Some applications just need to be fixed. > > > --cw > > ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-12 22:40 ` L A Walsh @ 2004-07-12 22:53 ` Chris Wedgwood 2004-07-13 1:44 ` Bernd Eckenfels 0 siblings, 1 reply; 52+ messages in thread From: Chris Wedgwood @ 2004-07-12 22:53 UTC (permalink / raw) To: L A Walsh; +Cc: Norberto Bensa, Jan Knutar, linux-kernel On Mon, Jul 12, 2004 at 03:40:08PM -0700, L A Walsh wrote: > If it is of any help (I doubt it, it perplexes me)...the files I've > written out with vim and have returned "nulls" have been files that > were written out 2-3 _DAYS_ earlier -- often with more recent write > having been saved fine. I've heard this before and you're not the only person to claim this. For a period of time the buffer-flushing code was broken and this was probably possible then, even sync/fsync failed to write stuff out. But that was a long time ago (last year) and I'm not sure that is still the case. It could be, the flushing code is quite complicated and I don't understand it fully, but testing seems to indicate it does work. To be quite honest I've never seen nulls in files that are days old, and I have scripts which checksum (md5) my files (hundreds of gigabytes) which would notice this, so knowing how to reproduce it would be nice. > I've also seen sections in log files where blocks would return zero > in the middle of a log. Log was being appended, system crashed, you get nulls at the end when rebooted, the logger opens the file for append and starts writing stuff, the nulls end up in the middle. Arguably this is expected. > Obviously blocks before and after successfully made it to disk, but > in _RARE_ circumstances (crashes and unplanned shutdowns are already > rare enough, so it's a rare bug that only shows up on a 'rare' > occasion... :-) It can't be blocks before and after, if that was the case you wouldn't see the nulls. I'm pretty sure for you the nulls are not really on-disk, looking at the raw device you won't see them. 
The nulls are returned for unwritten extents just as nulls are returned for holes in sparse files. > Almost (shot in the dark), like some code that was supposed to zero > unused but allocated datablocks got pointed at the wrong blocks, > since these files are readable as having been written (yes may all > be out of membuffs) and are often recoverable from the day's backup. I can't see how. It seems to me that if block pointers got all messed up, xfs_repair would scream bloody murder and things would explode and die on a live fs. I don't see reports that look like this. > If it was a file I just edited and then it crashed, that I could > understand more than having files I haven't touched for a few days > be zapped. My gut feeling is these files really are being changed. Stat should show if this is the case. --cw ^ permalink raw reply [flat|nested] 52+ messages in thread
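Alongside the checksum scripts and the stat check mentioned above, a file can also be scanned directly for blocks that read back as nothing but NUL bytes. This is a rough sketch in Python; find_null_blocks and the 4096-byte block size are assumptions made for illustration, not anything used by the posters.

```python
import os
import tempfile


def find_null_blocks(path, block_size=4096):
    """Return the offsets of blocks in path that contain only NUL bytes."""
    offsets = []
    with open(path, "rb") as f:
        offset = 0
        while True:
            block = f.read(block_size)
            if not block:
                break
            if block.count(b"\0") == len(block):
                offsets.append(offset)
            offset += len(block)
    return offsets


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "messages")
        with open(sample, "wb") as f:
            f.write(b"a" * 4096 + b"\0" * 4096 + b"boot line\n")
        print(find_null_blocks(sample))
        print(os.stat(sample).st_mtime)  # compare with the last known edit
```

A scan like this cannot tell whether the zeros are stored on disk or synthesized for an unwritten extent; it only reports what read() returns. Comparing st_mtime with the time of the last intentional edit is one way to follow up on the suggestion that the file may have been rewritten more recently than assumed.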
* Re: XFS: how to NOT null files on fsck? 2004-07-12 22:53 ` Chris Wedgwood @ 2004-07-13 1:44 ` Bernd Eckenfels 2004-07-13 5:24 ` Chris Wedgwood 0 siblings, 1 reply; 52+ messages in thread From: Bernd Eckenfels @ 2004-07-13 1:44 UTC (permalink / raw) To: linux-kernel In article <20040712225338.GD23623@taniwha.stupidest.org> you wrote: > To be quite honest I've never seen nulls in files that are days old, and > I have scripts which checksum (md5) my files (hundreds of gigabytes) > which would notice this, so knowing how to reproduce it would be nice. I can say that nulls in files are most common at the end of (sys)log files, filling up to the next block boundary. I always assumed this is due to the fact that the filesize in the metadata was not written but the last half-finished block was already linked in the inode structure. I have never seen null filled data or config files other than that, but I do not have busy servers crashing often on me. > Log was being appended, system crashed, you get nulls at the end when > rebooted, the logger opens the file for append and starts writing stuff, > the nulls end up in the middle. Arguably this is expected. Yes, and it is normally easy to spot, since the messages after the nulls are boot messages. > see the nulls. I'm pretty sure for you the nulls are not really > on-disk, looking at the raw device you won't see them. The nulls are > returned for unwritten extents just as nulls are returned for holes in > sparse files. ls -s compared with ls -l should make that visible? Greetings Bernd -- eckes privat - http://www.eckes.org/ Project Freefire - http://www.freefire.org/ ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-13 1:44 ` Bernd Eckenfels @ 2004-07-13 5:24 ` Chris Wedgwood 0 siblings, 0 replies; 52+ messages in thread From: Chris Wedgwood @ 2004-07-13 5:24 UTC (permalink / raw) To: Bernd Eckenfels; +Cc: linux-kernel On Tue, Jul 13, 2004 at 03:44:52AM +0200, Bernd Eckenfels wrote: > I can say that nulls in files are most common at the end of > (sys)log files, filling up to the next block boundary. Ideally syslog would rewind back past any nulls when it opens files. > ls -s compared with ls -l should make that visible? No, unwritten extents have an on-disk place, just the data isn't written. I'm not sure if there is an easy way to tell if an extent is unwritten or not, I guess you could use xfs_bmap -p if that's working right for you. --cw ^ permalink raw reply [flat|nested] 52+ messages in thread
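For contrast with the answer above, here is a sketch of the sparse-file case that the thread uses as an analogy, assuming Python on a POSIX system (st_blocks is not available everywhere). The names make_sparse and describe are hypothetical. For a real hole, the apparent size (ls -l) and the allocated space (ls -s) can differ; per the message above, an unwritten extent does have space allocated, so this comparison would not reveal it.

```python
import os
import tempfile


def make_sparse(path, hole_size=1 << 20):
    """Create a file with a hole of hole_size bytes followed by b'end'."""
    with open(path, "wb") as f:
        f.seek(hole_size)   # skip ahead without writing anything
        f.write(b"end")


def describe(path):
    """Return (apparent size, bytes allocated), like ls -l versus ls -s."""
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512   # st_blocks counts 512-byte units


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "sparse.bin")
        make_sparse(sample)
        print(describe(sample))
        with open(sample, "rb") as f:
            print(f.read(16))   # the hole reads back as NUL bytes
```

On a filesystem that supports holes, the allocated figure is typically far smaller than the apparent size, while reads from the hole still return zeros. Whether a given filesystem actually leaves the hole unallocated is up to that filesystem.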
* Re: XFS: how to NOT null files on fsck? [not found] ` <2hgxc-5x9-9@gated-at.bofh.it> @ 2004-07-13 7:25 ` Anton Ertl 2004-07-13 8:09 ` Chris Wedgwood 2004-07-13 22:24 ` Helge Hafting 0 siblings, 2 replies; 52+ messages in thread From: Anton Ertl @ 2004-07-13 7:25 UTC (permalink / raw) To: linux-kernel; +Cc: Chris Wedgwood, Jan Knutar, L A Walsh Chris Wedgwood <cw@f00f.org> writes: >XFS does *not* zero files, it simply returns zeros for unwritten >extents. If you open an existing file and scribble all over it, you >might see the old data during a crash, or the new data if it was >flushed. You shouldn't see zero's though. > >What does happen though, is that dotfiles are truncated and rewritten, >if the data blocks aren't flushed you will get zeros back because the >extents were unwritten. This is really the only sensible thing to do >given the circumstances. > >My guess is that with other fs' (when journaling metadata only) the >blocks allocated for the newly written data are *usually* the same as >the recently freed blocks from the truncate so things appear to work >but in reality it's probably mostly luck. A secure FS must ensure that other people's deleted data does not end up in the file. AFAIK FSs don't record owners for free blocks, so they can only ensure this by zeroing the blocks. So I doubt that you will see any different behaviour from an FS that keeps only meta-data consistent and writes meta-data before data. >Some applications just need to be fixed. It's too hard to fix the applications, since there is no easy way to test that they are really fixed. Also, the number of applications is much higher than the number of file systems. The way to go is to fix the file system (well, often it means a new FS). The file system should provide something that I call in-order semantics, i.e., that the disk state always represents an existing (possibly old) logical state of the FS, not some state that never existed, or some existing state with missing data. 
My favourite approach to achieve these semantics is based on log-structured file systems (see <http://www.complang.tuwien.ac.at/anton/lfs/> for some ideas and also a longer description of in-order semantics), but there are also other approaches: I believe that Soft Updates, when implemented correctly, provide in-order semantics, and Reiser4 may provide them, too. - anton -- M. Anton Ertl Some things have to be seen to be believed anton@mips.complang.tuwien.ac.at Most things have to be believed to be seen http://www.complang.tuwien.ac.at/anton/home.html ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-13 7:25 ` Anton Ertl @ 2004-07-13 8:09 ` Chris Wedgwood 2004-07-13 9:34 ` Anton Ertl 2004-07-13 22:24 ` Helge Hafting 1 sibling, 1 reply; 52+ messages in thread From: Chris Wedgwood @ 2004-07-13 8:09 UTC (permalink / raw) To: Anton Ertl; +Cc: linux-kernel, Jan Knutar, L A Walsh On Tue, Jul 13, 2004 at 07:25:29AM +0000, Anton Ertl wrote: > A secure FS must ensure that other people's deleted data does not > end up in the file. AFAIK FSs don't record owners for free blocks, > so they can only ensure this by zeroing the blocks. How can free blocks have an owner? They wouldn't be free then. > So I doubt that you will see any different behaviour from an FS that > keeps only meta-data consistent and writes meta-data before data. You do, some fs' will return stale data. > It's too hard to fix the applications, since there is no easy way to > test that they are really fixed. No, it's not hard to fix the applications and it's easy to tell if they are fixed. > Also, the number of applications is much higher than the number of > file systems. You don't fix all applications, only ones where data is critical and their handling of it is poor. MTAs like postfix don't have a problem for example, they are generally written well. > The file system should provide something that I call in-order > semantics, i.e., that the disk state always represents an existing > (possibly old) logical state of the FS, not some state that never > existed, or some existing state with missing data. ext3 and reiserfs have what amounts to this as an option right now. It has some performance implications but I'm told works great. I don't think the current XFS behaviour is undesirable or broken. --cw ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-13 8:09 ` Chris Wedgwood @ 2004-07-13 9:34 ` Anton Ertl 2004-07-13 9:53 ` Chris Wedgwood 0 siblings, 1 reply; 52+ messages in thread From: Anton Ertl @ 2004-07-13 9:34 UTC (permalink / raw) To: Chris Wedgwood; +Cc: linux-kernel, Jan Knutar, L A Walsh Chris Wedgwood wrote: > > On Tue, Jul 13, 2004 at 07:25:29AM +0000, Anton Ertl wrote: > > > A secure FS must ensure that other people's deleted data does not > > end up in the file. AFAIK FSs don't record owners for free blocks, > > so they can only ensure this by zeroing the blocks. > > How can free blocks have an owner? They wouldn't be free then. It would be the former owner of the block. > > So I doubt that you will see any different behaviour from an FS that > > keeps only meta-data consistent and writes meta-data before data. > > You do, some fs' will return stale data. Stale data yes, but probably not stale data from blocks that were formerly free (or the file system is insecure). > > It's too hard to fix the applications, since there is no easy way to > > test that they are really fixed. > > No, it's not hard to fix the applications and it's easy to tell if > they are fixed. So, how do you tell? > > Also, the number of applications is much higher than the number of > > file systems. > > You don't fix all applications, only ones where data is critical and > their handling of it is poor. MTAs like postfix don't have a problem > for example, they are generally written well. Where is data not critical? I had such a problem even with a widely-used application like GNU Emacs (many years ago, may be fixed now), casting doubt on your claim that fixing the application is easy. > > The file system should provide something that I call in-order > > semantics, i.e., that the disk state always represents an existing > > (possibly old) logical state of the FS, not some state that never > > existed, or some existing state with missing data. 
> > ext3 and reiserfs have what amounts to this as an option right now. > It has some performance implications but I'm told works great. You mean ext3 data=journal? The last I heard about it was that it was broken. ext3 data=ordered will probably also work better in most cases than an FS with eager meta-data updates (like, apparently, XFS), but I don't think it guarantees in-order semantics. - anton ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-13 9:34 ` Anton Ertl @ 2004-07-13 9:53 ` Chris Wedgwood 2004-07-13 10:27 ` Tim Connors 2004-07-13 13:33 ` Anton Ertl 0 siblings, 2 replies; 52+ messages in thread From: Chris Wedgwood @ 2004-07-13 9:53 UTC (permalink / raw) To: Anton Ertl; +Cc: linux-kernel, Jan Knutar, L A Walsh On Tue, Jul 13, 2004 at 11:34:54AM +0200, Anton Ertl wrote: > It would be the former owner of the block. there might not be a former owner (in most cases there probably isn't) > Stale data yes, but probably not stale data from blocks that were > formerly free (or the file system is insecure). some, like reiserfs apparently do (or did, it may be different now, i've not used reiserfs for a long time) > So, how do you tell? code inspection and/or testing > Where is data not critical? that depends on the person and situation, for me personally lots of my data isn't critical. certainly it's annoying to lose data but probably not life threatening > I had such a problem even with a widely-used application like GNU > Emacs (many years ago, may be fixed now), casting doubt on your > claim that fixing the application is easy. emacs will usually rename the old file so at the very least you have that i've had emacs crash and whilst it's frustrating, it certainly isn't as bad as losing an email (which may or may not be important, i'll decide that after i read it) > ext3 data=ordered will probably also work better in most cases than an > FS with eager meta-data updates (like, apparently, XFS), but I don't > think it guarantees in-order semantics. i thought that was the point of it? 
as best as i can tell the metadata changes will become visible after the data has updated

however, in the case of something like kde/emacs/whatever you can *still* lose data

consider something like:

    open with truncate
    crash

or more likely:

    open with truncate
    write some data
    crash

there is also an even more common case than either of these:

    open with truncate
    write data, get -ENOSPC
    application terminates/aborts

at which point you've stomped on your file. it's not uncommon for KDE to do this (even though the window would apparently be very small) --cw ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-13 9:53 ` Chris Wedgwood @ 2004-07-13 10:27 ` Tim Connors 2004-07-13 10:38 ` ismail dönmez 2004-07-13 10:58 ` Chris Wedgwood 2004-07-13 13:33 ` Anton Ertl 1 sibling, 2 replies; 52+ messages in thread From: Tim Connors @ 2004-07-13 10:27 UTC (permalink / raw) To: Chris Wedgwood; +Cc: Anton Ertl, linux-kernel, Jan Knutar, L A Walsh Chris Wedgwood <cw@f00f.org> said on Tue, 13 Jul 2004 02:53:00 -0700: > at which point you've stomped on your file. it's non uncommong for > KDE to do this (even though the window would apparently be very small) KDE is a piece of shit with regards to file handling. It seems they never learnt the lessons of writing files in Unix that have been learnt over the last 30 years. How the hell can you afford to hose your entire WM because KDE decides to write some obscure file at some time when the NFS servers just happen to be temporarily down? What ever happened to the standard practice of write to temp file, then atomic rename? What ever happened to making backups of critical files before overwriting them? Furrfu. Makes me glad I use a much more sane WM, but I pity those 3 users in the space of a few minutes who lost all of their settings. BTW, I have submitted the occasional bug to Debian because packages will cause dataloss to an /etc file if the disk happens to run out at the wrong moment (quite a common occurrence for me). Furrfu people - this is so bloody simple to get right. -- TimC -- http://astronomy.swin.edu.au/staff/tconnors/ The prolonged application of polysyllabic vocabulary infallibly exercises a deleterious influence on the fecundity of expression, rendering the ultimate tendancy apocryphal. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-13 10:27 ` Tim Connors @ 2004-07-13 10:38 ` ismail dönmez 2004-07-13 11:16 ` Nick Piggin 2004-07-13 10:58 ` Chris Wedgwood 1 sibling, 1 reply; 52+ messages in thread From: ismail dönmez @ 2004-07-13 10:38 UTC (permalink / raw) To: Tim Connors Cc: Chris Wedgwood, Anton Ertl, linux-kernel, Jan Knutar, L A Walsh Trying to start a flame war with bitching about KDE? How about trying to solve at least work around it? No? Then please shut the fuck up. On Tue, 13 Jul 2004 20:27:30 +1000, Tim Connors <tconnors@astro.swin.edu.au> wrote: > KDE is a peice of shit with regards to file handling. > > It seems they never learnt the lessons of writing files in Unix that > have been learnt over the last 30 years. > > How the hell can you afford to hose your entire WM because KDE decides > to write some obscure file at some time when the NFS servers just > happen to be temporarily down? What ever happened to the standard > practice of write to temp file, then atomic rename? What ever happened > to making backups of critical files before overwriting them? Furrfu. > > Makes me glad I use a much more sane WM, but I pity those 3 users in > the space of a few minutes who lost all of their settings. > > BTW, I have submitted the occasional bug to Debian because packages > will cause dataloss to an /etc file if the disk happens to run out at > the wrong moment (quite a common occurence for me). Furrfu people - > this is so bloody simple to get right. > > -- > TimC -- http://astronomy.swin.edu.au/staff/tconnors/ > The prolonged application of polysyllabic vocabulary infallibly > exercises a deleterious influence on the fecundity of expression, > rendering the ultimate tendancy apocryphal. 
-- Time is what you make of it ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-13 10:38 ` ismail dönmez @ 2004-07-13 11:16 ` Nick Piggin 2004-07-13 12:52 ` ismail dönmez 0 siblings, 1 reply; 52+ messages in thread From: Nick Piggin @ 2004-07-13 11:16 UTC (permalink / raw) To: ismail dönmez Cc: Tim Connors, Chris Wedgwood, Anton Ertl, linux-kernel, Jan Knutar, L A Walsh ismail dönmez wrote: > Trying to start a flame war with bitching about KDE? How about trying > to solve at least work around it? No? Then please shut the fuck up. > This isn't really acceptable on this mailing list. If you are offended by someone bitching about KDE, I politely suggest that you unsubscribe. Nick ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-13 11:16 ` Nick Piggin @ 2004-07-13 12:52 ` ismail dönmez 0 siblings, 0 replies; 52+ messages in thread From: ismail dönmez @ 2004-07-13 12:52 UTC (permalink / raw) To: Nick Piggin Cc: Tim Connors, Chris Wedgwood, Anton Ertl, linux-kernel, Jan Knutar, L A Walsh Ok sorry for bad language but as an XFS & KDE user I would like to see better discussion like there maybe some workarounds for this apart from them dumping KDE or XFS. About the KDE config stuff it happens in kconfig.cpp which lies under kdelibs/kdecore directory on CVS if you want to look at it. Again sorry for bad language. Cheers, ismail On Tue, 13 Jul 2004 21:16:42 +1000, Nick Piggin <nickpiggin@yahoo.com.au> wrote: > This isn't really acceptable on this mailing list. > > If you are offended by someone bitching about KDE, I politely > suggest that you unsubscribe. > > Nick > -- Time is what you make of it ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-13 10:27 ` Tim Connors 2004-07-13 10:38 ` ismail dönmez @ 2004-07-13 10:58 ` Chris Wedgwood 1 sibling, 0 replies; 52+ messages in thread From: Chris Wedgwood @ 2004-07-13 10:58 UTC (permalink / raw) To: Tim Connors, ismail dönmez Cc: Anton Ertl, linux-kernel, Jan Knutar, L A Walsh On Tue, Jul 13, 2004 at 08:27:30PM +1000, Tim Connors wrote: > KDE is a piece of shit with regards to file handling. I personally would like to see KDE made more robust here (since I use it myself). I'm guessing it's probably not hard but I don't have a good feeling as the few times I have hacked KDE I was pretty disappointed how bad the code is. That said, my guess is common code handles most of this stuff so the right fixes in one or two places would probably cover everything. > Makes me glad I use a much more sane WM, but I pity those 3 users in > the space of a few minutes who lost all of their settings. I back up my .kde every now and then as a precaution. It's generally not a problem for me but as mentioned I am aware KDE could be better in this regard. Losing window manager settings is a pain, losing data from knotes and your bookmarks is very much more frustrating though. On Tue, Jul 13, 2004 at 01:38:40PM +0300, ismail dönmez wrote: > Trying to start a flame war with bitching about KDE? I'm not sure he was. > How about trying to solve at least work around it? He doesn't use it, why would he bother? On the other hand, one day I might (I hope someone else does before me, the KDE code is scary). Since so many files are involved (I have 350 in my .kde) I suspect properly fixing this is going to be more involved than write, fsync, rename but it probably wouldn't be a bad place to start (only 43 of them were modified in the last day). --cw ^ permalink raw reply [flat|nested] 52+ messages in thread
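The "write to temp file, then atomic rename" practice and the "write, fsync, rename" starting point mentioned in this part of the thread can be sketched roughly as follows. This is only an illustration in Python, not KDE's code or anyone's actual patch: the helper name atomic_write and the ".tmp-" prefix are made up for the example, and it assumes a POSIX system where rename within one directory is atomic and a directory can be opened and fsynced.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` (bytes) so a crash leaves the old or the new file.

    Hypothetical helper for illustration; assumes POSIX rename semantics.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename never crosses filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # get the data blocks to disk before the rename
        os.replace(tmp_path, path)  # the old file is replaced in a single step
    except BaseException:
        # On any failure (for example ENOSPC) the original file has not been touched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Also fsync the directory so the rename itself survives a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Unlike "open with truncate, write", there is no point in this sequence where the only copy of the file is empty or half written. One caveat of the sketch: mkstemp creates the file with mode 0600, so an application that cares about the original permissions or ownership would have to copy them over before the rename.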
* Re: XFS: how to NOT null files on fsck? 2004-07-13 9:53 ` Chris Wedgwood 2004-07-13 10:27 ` Tim Connors @ 2004-07-13 13:33 ` Anton Ertl 2004-07-13 20:32 ` Chris Wedgwood 1 sibling, 1 reply; 52+ messages in thread From: Anton Ertl @ 2004-07-13 13:33 UTC (permalink / raw) To: Chris Wedgwood; +Cc: linux-kernel, Jan Knutar, L A Walsh Chris Wedgwood wrote: > > On Tue, Jul 13, 2004 at 11:34:54AM +0200, Anton Ertl wrote: > > > It would be the former owner of the block. > > there might not be a former owner (in most cases there probably isn't) If the owner of the file is not the former owner of the block, the FS certainly should not put the block in the file. > > So, how do you tell? > > code inspection and/or testing How do you test? Code inspection is good, but I think it needs to be complemented by testing. > > Where is data not critical? > > that depends on the person and situation, for me personally lots of my > data isn't critical. certainly it's annoying to loose data but > probably not life threatening We are balancing three things: making the file system nicer; working around non-nice file-systems in the applications; and losing data (even if it's just annoying rather than life-threatening). IMO losing data is the worst of these alternatives, and making file system nicer is the best one. > > I had such a problem even with a widely-used application like GNU > > Emacs (many years ago, may be fixed now), casting doubt on your > > claim that fixing the application is easy. > > emacs will usually rename the old file so at the very least you have > that Emacs does that only once per session, and I tend to stay in an Emacs session for days or weeks (and others probably do so, too). Then there is the auto-save file, but unfortunately eager meta-data updates trash that, too (see <http://www.complang.tuwien.ac.at/anton/sync-metadata-updates.html>). 
> > ext3 data=ordered will probably also work better in most cases than an > > FS with eager meta-data updates (like, apparently, XFS), but I don't > > think it guarantees in-order semantics. > > i thought that was the point of it? as best as i can tell the > metadata changes will become visible after the data has updated Right, but that's not sufficient. I am not an expert on ext3, but from the description I have read that's all it guarantees. If an application does a meta-data update, and then a data update, the disk state on crash might be that the data update was done and the meta-data update was not, which is not any of the states that ever existed logically. > however, in the case of something like kde/emacs/whatever you can > *still* lose data > > consider something like: > > open with truncate > crash > > or more likely: > > open with truncate > write some data > crash > > there is also an even more common case than either of these: > > open with truncate > write data, get -ENOSPC > application terminates/aborts > > at which point you've stomped on your file. it's not uncommon for > KDE to do this (even though the window would apparently be very small) There are certainly ways that an application can lose data even with a fully synchronous file system (which is the semantically nicest thing you can ask for (ignoring transactions)), but I am not talking about that. Applications can be tested against that relatively easily by killing the application and seeing if the files are ok. I am talking about ways that data can be lost because the file system does not have the nice semantics of a fully synchronous one. 
The in-order guarantee is something that can be implemented relatively efficiently and that does not add any local ways that data can be lost or become inconsistent (it does add ways to become inconsistent in distributed applications, though, but there are fewer of these applications around, and their programmers are more used to thinking about concurrency, and thus hopefully better prepared to insert fsyncs etc. at the right place). - anton ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-13 13:33 ` Anton Ertl @ 2004-07-13 20:32 ` Chris Wedgwood 2004-07-13 22:42 ` Bernd Eckenfels 2004-07-14 18:49 ` Anton Ertl 0 siblings, 2 replies; 52+ messages in thread From: Chris Wedgwood @ 2004-07-13 20:32 UTC (permalink / raw) To: Anton Ertl; +Cc: linux-kernel, Jan Knutar, L A Walsh On Tue, Jul 13, 2004 at 03:33:23PM +0200, Anton Ertl wrote: > If the owner of the file is not the former owner of the block, the FS > certainly should not put the block in the file. sorry, i don't understand that > How do you test? running the code and pressing reset or similar > We are balancing three things: making the file system nicer; working > around non-nice file-systems in the applications; and losing data > (even if it's just annoying rather than life-threatening). IMO losing > data is the worst of these alternatives, and making file system nicer > is the best one. all these things have trade-offs, plenty of people are happy with the current balance for those that are not you can use something else > Right, but that's not sufficient. I am not an expert on ext3, but > from the description I have read that's all it guarantees. If an > application does a meta-data update, and then a data update, the > disk state on crash might be that the data update was done and the > meta-data update was not, which is not any of the states that ever > existed logically. i don't see how for ordered updates that can occur, otherwise they wouldn't be ordered > Applications can be tested against that relatively easily by killing > the application and seeing if the files are ok. i've seen both KDE and emacs lose data by crashing, does the fix for that belong in the fs too? > I am talking about ways that data can be lost because the file > system does not have the nice semantics of a fully synchronous one. 
mount -o sync > The in-order guarantee is something that can be implemented > relatively efficiently let's see a patch, please give details of performance differences i don't think the current situation is all bad or even undesirable, yes, it is a balance and i think it's fine as-is what you want is much more high-level semantics in the filesystem which possibly will have large performance implications. i'm not sure such semantics are *required* to be in the fs or should be there also, this is fixing the relatively rare case where the system crashes, which to be quite honest is a bigger concern, why not seek solutions that deal with more common failure modes like applications crashing or behaving badly? ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-13 20:32 ` Chris Wedgwood @ 2004-07-13 22:42 ` Bernd Eckenfels 2004-07-14 18:49 ` Anton Ertl 1 sibling, 0 replies; 52+ messages in thread From: Bernd Eckenfels @ 2004-07-13 22:42 UTC (permalink / raw) To: linux-kernel In article <20040713203246.GB6614@taniwha.stupidest.org> you wrote: > running the code and pressing reset or similar hmm... perhaps an LD_PRELOAD wrapper (based on fakeroot) which logs all filenames of writes with no fsync (in addition to renames and unlinks) may easily allow finding them by name. let me check that out, it could even overwrite close() (which will for sure make the system slower) Greetings Bernd -- eckes privat - http://www.eckes.org/ Project Freefire - http://www.freefire.org/ ^ permalink raw reply [flat|nested] 52+ messages in thread
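The bookkeeping such a wrapper would need can be sketched independently of LD_PRELOAD. The following Python fragment is only an illustration of the idea, not Bernd's tool: the event tuples and the name unsynced_files are hypothetical, and a real wrapper would be written in C, intercept the libc calls and track file descriptors rather than names.

```python
def unsynced_files(events):
    """Return paths that were written and then closed or renamed with no fsync in between.

    `events` is a hypothetical trace: tuples whose first element is one of
    "write", "fsync", "close" or "rename" and whose second element is the path
    (for "rename" this is the source path; a third element holds the target).
    """
    dirty = set()   # paths with writes not yet covered by an fsync
    flagged = []    # paths to report, in order of first detection
    for event in events:
        op, path = event[0], event[1]
        if op == "write":
            dirty.add(path)
        elif op == "fsync":
            dirty.discard(path)
        elif op in ("close", "rename"):
            if path in dirty:
                if path not in flagged:
                    flagged.append(path)
                dirty.discard(path)
    return flagged


trace = [
    ("write", "a.tmp"), ("fsync", "a.tmp"), ("close", "a.tmp"),
    ("rename", "a.tmp", "a"),
    ("write", "b"), ("close", "b"),
]
print(unsynced_files(trace))  # ['b']
```

In this made-up trace the first file follows the write, fsync, rename pattern and is not reported, while the second is closed with unflushed writes and is.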
* Re: XFS: how to NOT null files on fsck? 2004-07-13 20:32 ` Chris Wedgwood 2004-07-13 22:42 ` Bernd Eckenfels @ 2004-07-14 18:49 ` Anton Ertl 2004-07-14 19:00 ` Chris Wedgwood 1 sibling, 1 reply; 52+ messages in thread From: Anton Ertl @ 2004-07-14 18:49 UTC (permalink / raw) To: Chris Wedgwood; +Cc: linux-kernel, Jan Knutar, L A Walsh Chris Wedgwood wrote: > > On Tue, Jul 13, 2004 at 03:33:23PM +0200, Anton Ertl wrote: > > > If the owner of the file is not the former owner of the block, the FS > > certainly should not put the block in the file. > > sorry, i dont understand that I'll try to put it another way: If a free block was last allocated to a file belonging to user U, then it may be ok (it's not a security problem) to put the block in a file belonging to user U on recovery; if not, then it's certainly not ok to put it into such a file without erasing it first. If you don't understand that, please let me know where I am losing you. > > How do you test? > > running the code and pressing reset or similar Ok, I was thinking about this testing methodology, too. That's not what I call easy, and it has led to the current situation where many applications are not safe against the not-so-nice crash semantics of many file systems. > > Right, but that's not sufficient. I am not an expert on ext3, but > > from the description I have read that's all it guarantees. If an > > application does a meta-data update, and then a data update, the > > disk state on crash might be that the data update was done and the > > meta-data update was not, which is not any of the states that ever > > existed logically. > > i don't see how for ordered updates that can occur, otherwise they > wouldn't be ordered Full-blown ordering is hard in a file system that overwrites allocated blocks. E.g., consider writing a little bit to block A, then writing something to block B, then writing something to block A again. 
For proper in-order semantics these writes have to occur in that order, and the first write to block A must not already include the second write; this becomes complicated with lazy writing. Soft Updates do funny things with the cache to get the ordering of operations right. I don't know if ext3 data=ordered does any of this, but the description "data updates are flushed to disk before transactions commit" does not sound like it does. OTOH, the data=ordered approach may be good enough for most applications (which deal with whole files rather than changes to parts of a file), so maybe any further effort will not provide enough benefit to gain much popularity. It's certainly much nicer than any eager-meta-data-update system like (apparently) XFS. > > Applications can be tested against that relatively easily by killing > > the application and seeing if the files are ok. > > i've seen both KDE emacs loose data by crashing, does the fix for that > belong in the fs too? Application crashing? No; I don't see how the file system can fix that. I have never seen Emacs lose data from crashing or (more frequently) being killed. Do you have an idea what went wrong in your case and how they In any case, if the developers have a hard time protecting even against application crashes/kills, I would not expect them to go to the effort and succeed in protecting against not-so-nice FS crash semantics. > > I am talking about ways that data can be lost because the file > > system does not have the nice semantics of a fully synchronous one. > > mount -o sync Very slow, and I would not trust it, because it probably receives very little testing. > > The in-order guarantee is something that can be implemented > > relatively efficiently > > let's see a patch, please give details of performance differences Take a look at <http://www.complang.tuwien.ac.at/czezatke/lfs.html>. For performance results look at Section 7 of <http://www.complang.tuwien.ac.at/papers/czezatke%26ertl00/>. 
I would not recommend using that stuff instead of any of the established FSs, but it may be good enough to answer your questions. > what you want is much more high-level semantics in the filesystem which > possibly will have large performance implications. I don't think that the performance implications are large in typical situations, in contrast to the solution you proposed (mount -o sync). > i'm not sure such > semantics are *required* to be in the fs or should be there Required by whom? Me? Yes! > also, this is fixing the relatively rare case where the system > crashes, which to be quite honest is a bigger concern, why not seek > solutions that deal with more common failure modes like applications > crashing or behaving badly? What I am proposing is extending the solutions for that to also work for the system crash case. This will increase the incentive for the programmers to fix the application crash case. BTW, the way my current hardware acts up, system crashes are more frequent than application crashes, and certainly more frequent than applications behaving badly. - anton ^ permalink raw reply [flat|nested] 52+ messages in thread
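The in-order property from this message (the disk state after a crash must be a state that existed logically at some point) and the "write A, write B, write A again" example can be made concrete with a toy model. This is a sketch under simplifying assumptions: each block holds one value, every write replaces the whole block, and the names apply_ops and is_in_order are invented for the illustration. It models none of ext3, XFS or Soft Updates.

```python
def apply_ops(ops, initial):
    """Return the logical state after applying (block, value) writes in order."""
    state = dict(initial)
    for block, value in ops:
        state[block] = value
    return state


def is_in_order(disk_state, ops, initial):
    """True if disk_state equals the logical state after some prefix of ops."""
    return any(
        apply_ops(ops[:n], initial) == disk_state
        for n in range(len(ops) + 1)
    )


initial = {"A": "a0", "B": "b0"}
ops = [("A", "a1"), ("B", "b1"), ("A", "a2")]

# A lazy cache that merged both writes to A and flushed A before B:
print(is_in_order({"A": "a2", "B": "b0"}, ops, initial))  # False
# A state that did exist, after the first two writes:
print(is_in_order({"A": "a1", "B": "b1"}, ops, initial))  # True
```

The first case is the complication described above: once the cache holds only the final contents of A, flushing A ahead of B produces a disk state that never existed logically.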
* Re: XFS: how to NOT null files on fsck? 2004-07-14 18:49 ` Anton Ertl @ 2004-07-14 19:00 ` Chris Wedgwood 0 siblings, 0 replies; 52+ messages in thread From: Chris Wedgwood @ 2004-07-14 19:00 UTC (permalink / raw) To: Anton Ertl; +Cc: linux-kernel, Jan Knutar, L A Walsh On Wed, Jul 14, 2004 at 08:49:03PM +0200, Anton Ertl wrote: > If a free block was last allocated to a file belonging to user U, > then it may be ok (it's not a security problem) to put the block in > a file belonging to user U on recovery; if not, then it's certainly > not ok to put it into such a file without erasing it first. that's still a big security problem, consider files with restricted paths all of a sudden appearing or globally visible root-owned files appearing with old root-only data in them > I have never seen Emacs lose data from crashing or (more frequently) > being killed. Do you have an idea what went wrong in your case and > how they no idea, for a while it would segfault when you resized the window and you would lose everything, (no crash handler to attempt to save things i guess) > Take a look at <http://www.complang.tuwien.ac.at/czezatke/lfs.html>. apples and oranges > BTW, the way my current hardware acts up, system crashes are more > frequent than application crashes, and certainly more frequent than > applications behaving badly. you need new hardware or a new system then this entire thread is dragging on and seems to have become a religious discussion about how XFS should because various people don't like its current behavior despite the way things have worked that way for many many years i don't care if people use XFS or not, there are plenty of alternatives ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-13 7:25 ` Anton Ertl 2004-07-13 8:09 ` Chris Wedgwood @ 2004-07-13 22:24 ` Helge Hafting 2004-07-13 22:39 ` Chris Wedgwood ` (2 more replies) 1 sibling, 3 replies; 52+ messages in thread From: Helge Hafting @ 2004-07-13 22:24 UTC (permalink / raw) To: Anton Ertl; +Cc: linux-kernel, Chris Wedgwood, Jan Knutar, L A Walsh On Tue, Jul 13, 2004 at 07:25:29AM +0000, Anton Ertl wrote: > Chris Wedgwood <cw@f00f.org> writes: > >XFS does *not* zero files, it simply returns zeros for unwritten > >extents. If you open an existing file and scribble all over it, you > >might see the old data during a crash, or the new data if it was > >flushed. You shouldn't see zero's though. > > > >What does happen though, is that dotfiles are truncated and rewritten, > >if the data blocks aren't flushed you will get zeros back because the > >extents were unwritten. This is really the only sensible thing to do > >given the circumstances. > > > >My guess is that with other fs' (when journaling metadata only) the > >blocks allocated for the newly written data are *usually* the same as > >the recently freed blocks from the truncate so things appear to work > >but in reality it's probably mostly luck. > > A secure FS must ensure that other people's deleted data does not end > up in the file. AFAIK FSs don't record owners for free blocks, so > they can only ensure this by zeroing the blocks. So I doubt that you > will see any different behaviour from an FS that keeps only meta-data > consistent and writes meta-data before data. > There is another solution - zero blocks when freeing them. (Or put them on a list for later zeroing when the fs isn't busy, in order to keep good performance) With this approach you don't need to zero a half-written block after a crash, which means you destroy less data. Helge Hafting ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-13 22:24 ` Helge Hafting @ 2004-07-13 22:39 ` Chris Wedgwood 2004-07-13 23:23 ` Bernd Eckenfels 2004-07-14 18:53 ` Anton Ertl 2 siblings, 0 replies; 52+ messages in thread From: Chris Wedgwood @ 2004-07-13 22:39 UTC (permalink / raw) To: Helge Hafting; +Cc: Anton Ertl, linux-kernel, Jan Knutar, L A Walsh On Wed, Jul 14, 2004 at 12:24:11AM +0200, Helge Hafting wrote: > There is another solution - zero blocks when freeing them. slow > (Or put them on a list for later zeroing when the fs isn't busy, in > order to keep good performance) complicated, doesn't buy us anything, it also means the blocks are tied up whilst they are being zeroed (consider a truncate on a multi-gb file, fairly common) > With this approach you don't need to zero a half-written > block after a crash, which means you destroy less data. it doesn't zero after a crash, what happens is the blocks never make it to disk and the metadata (which did make it to disk) reflects this so read returns nulls as is, you can truncate a multi-gb file, write over it and the only IO you see will be the new data being written out, zeroing in between would be horribly painful --cw ^ permalink raw reply [flat|nested] 52+ messages in thread
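The end result described here (the metadata says the file has a certain length, the data never reached the disk, so reads return nulls) can be mimicked without a crash on any POSIX filesystem. The sketch below is only an analogy and does not exercise XFS log replay or unwritten extents: it truncates a file and then sets its size with ftruncate without writing any data, so the bytes covered by the size were never written. The function name and the file name "dotfile" are made up for the example.

```python
import os
import tempfile


def read_after_size_only_update():
    """Truncate a file, extend its size without writing data, and read it back."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "dotfile")
        with open(path, "wb") as f:
            f.write(b"old settings")
        # "Rewrite" the file: truncate it, then update only the size.
        fd = os.open(path, os.O_WRONLY | os.O_TRUNC)
        try:
            os.ftruncate(fd, 12)  # metadata now says 12 bytes; no data was written
        finally:
            os.close(fd)
        with open(path, "rb") as f:
            return f.read()


print(read_after_size_only_update())  # twelve NUL bytes
```

The old contents are gone because of the truncate, and the new contents read as zeros because nothing was ever written for them, which is how a rewritten dotfile looks after the kind of crash discussed in this thread.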
* Re: XFS: how to NOT null files on fsck? 2004-07-13 22:24 ` Helge Hafting 2004-07-13 22:39 ` Chris Wedgwood @ 2004-07-13 23:23 ` Bernd Eckenfels 2004-07-14 18:53 ` Anton Ertl 2 siblings, 0 replies; 52+ messages in thread From: Bernd Eckenfels @ 2004-07-13 23:23 UTC (permalink / raw) To: linux-kernel In article <20040713222411.GA1035@hh.idb.hist.no> you wrote: > With this approach you don't need to zero a half-written > block after a crash, which means you destroy less data. which does not change the fact that the block contains zeros if it was not written. :) Greetings Bernd -- eckes privat - http://www.eckes.org/ Project Freefire - http://www.freefire.org/ ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-13 22:24 ` Helge Hafting 2004-07-13 22:39 ` Chris Wedgwood 2004-07-13 23:23 ` Bernd Eckenfels @ 2004-07-14 18:53 ` Anton Ertl 2 siblings, 0 replies; 52+ messages in thread From: Anton Ertl @ 2004-07-14 18:53 UTC (permalink / raw) To: Helge Hafting; +Cc: linux-kernel, Chris Wedgwood, Jan Knutar, L A Walsh Helge Hafting wrote: > There is another solution - zero blocks when freeing them. (Or > put them on a list for later zeroing when the fs isn't busy, > in order to keep good performance) > > With this approach you don't need to zero a half-written > block after a crash, which means you destroy less data. I don't think half-written blocks are the problem (at least not a frequent one). More typical is written meta-data without written data. In that case your solution will give the same result as the current solution, just at higher cost. - anton ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-10 18:55 ` Norberto Bensa 2004-07-10 19:19 ` Chris Wedgwood @ 2004-07-10 19:33 ` Andreas Schwab 2004-07-10 19:40 ` Chris Wedgwood 2004-07-10 19:46 ` Norberto Bensa 2004-07-11 1:21 ` Gopikrishnan Sidhardhan 2 siblings, 2 replies; 52+ messages in thread From: Andreas Schwab @ 2004-07-10 19:33 UTC (permalink / raw) To: Norberto Bensa; +Cc: Chris Wedgwood, Jan Knutar, L A Walsh, linux-kernel Norberto Bensa <norberto+linux-kernel@bensa.ath.cx> writes: > Chris Wedgwood wrote: >> XFS does not journal data. > > I think we all know that. The point, why the hell does it null files? Security. You don't want old contents of /etc/shadow appear in random files after a crash. Andreas. -- Andreas Schwab, SuSE Labs, schwab@suse.de SuSE Linux AG, Maxfeldstraße 5, 90409 Nürnberg, Germany Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different." ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-10 19:33 ` Andreas Schwab @ 2004-07-10 19:40 ` Chris Wedgwood 2004-07-10 19:46 ` Norberto Bensa 1 sibling, 0 replies; 52+ messages in thread From: Chris Wedgwood @ 2004-07-10 19:40 UTC (permalink / raw) To: Andreas Schwab; +Cc: Norberto Bensa, Jan Knutar, L A Walsh, linux-kernel On Sat, Jul 10, 2004 at 09:33:34PM +0200, Andreas Schwab wrote: > Security. You don't want old contents of /etc/shadow appear in > random files after a crash. If we had a different log format we could determine if the blocks were newly allocated and avoid zeroing that for existing files, we could even do the code to aggregate transactions which would be *really* nice for some things. Lots of work though. --cw ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-10 19:33 ` Andreas Schwab 2004-07-10 19:40 ` Chris Wedgwood @ 2004-07-10 19:46 ` Norberto Bensa 2004-07-10 20:03 ` Chris Wedgwood 1 sibling, 1 reply; 52+ messages in thread From: Norberto Bensa @ 2004-07-10 19:46 UTC (permalink / raw) To: Andreas Schwab; +Cc: Chris Wedgwood, Jan Knutar, L A Walsh, linux-kernel Andreas Schwab wrote: > Norberto Bensa <norberto+linux-kernel@bensa.ath.cx> writes: > > Chris Wedgwood wrote: > >> XFS does not journal data. > > > > I think we all know that. The point, why the hell does it null files? > > Security. You don't want old contents of /etc/shadow appear in random > files after a crash. Wow. You're telling me that XFS doesn't know if a given piece of the log is from file-a or file-b and just in case it zeroes its contents? If that's true, XFS has moved to my never-ever-use-it-again list. Thanks, Norberto ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-10 19:46 ` Norberto Bensa @ 2004-07-10 20:03 ` Chris Wedgwood 0 siblings, 0 replies; 52+ messages in thread From: Chris Wedgwood @ 2004-07-10 20:03 UTC (permalink / raw) To: Norberto Bensa; +Cc: Andreas Schwab, Jan Knutar, L A Walsh, linux-kernel On Sat, Jul 10, 2004 at 04:46:27PM -0300, Norberto Bensa wrote: > Wow. You're telling me that XFS doesn't know if a given piece of the > log is from file-a or file-b and just in case it zeroes its > contents? No. The log-replay can't tell where that block came from --- it might have been newly allocated and therefore need zeroing. > If that's true, XFS has moved to my never-ever-use-it-again list. There are many alternatives. --cw ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-10 18:55 ` Norberto Bensa 2004-07-10 19:19 ` Chris Wedgwood 2004-07-10 19:33 ` Andreas Schwab @ 2004-07-11 1:21 ` Gopikrishnan Sidhardhan 2 siblings, 0 replies; 52+ messages in thread From: Gopikrishnan Sidhardhan @ 2004-07-11 1:21 UTC (permalink / raw) To: linux-kernel Norberto Bensa wrote: > Chris Wedgwood wrote: > >>XFS does not journal data. > > > I think we all know that. The point, why the hell does it null files? See http://www-106.ibm.com/developerworks/linux/library/l-fs9.html - under the section 'Journaling'. Thanks, --GS ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-07-09 16:37 ` L A Walsh 2004-07-09 21:59 ` Chris Wedgwood @ 2004-07-29 1:30 ` Nathan Scott 2004-08-03 18:31 ` L A Walsh 1 sibling, 1 reply; 52+ messages in thread From: Nathan Scott @ 2004-07-29 1:30 UTC (permalink / raw) To: Norberto Bensa, L A Walsh; +Cc: linux-kernel, linux-xfs On Fri, Jul 09, 2004 at 09:37:48AM -0700, L A Walsh wrote: > It's a feature! :-) > > It's been in the code for years to randomly write nulls to some files Pfft, nonsense. The problem relates to an updated inode size being flushed ahead of the data behind it (hence a size update can make it out before delayed allocate extents do, and we end up with a hole beyond the end of file, which reads as zeroes). > Apparently not easily reproduced, no one has a clue why it does it. > Just does. No, it's actually well known why it behaves this way. We are looking into ways to address this, and have some ideas - the trick is fixing it without hurting write performance - which we will do, it's just not trivial. There are several techniques to reduce the impact of this behaviour, as others have described (or see the linux-xfs archives). cheers. -- Nathan ^ permalink raw reply [flat|nested] 52+ messages in thread
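The "hole beyond the end of file, which reads as zeroes" that Nathan describes can be seen from user space without a crash. Below is a minimal sketch in Python, assuming a POSIX-like system. It writes a few bytes and then grows the file size with `os.ftruncate` without writing any data for the new range, which is roughly the on-disk state when a size update gets out before the data does. The helper name `read_after_size_bump` is made up for illustration, and nothing here exercises XFS itself.

```python
import os
import tempfile


def read_after_size_bump(payload: bytes, new_size: int) -> bytes:
    """Write payload, grow the file to new_size without writing data, read it back."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        # Only the file size changes here; no data is written for the new range.
        os.ftruncate(fd, new_size)
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, new_size)
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    # The bytes past the written payload come back as NUL bytes.
    print(read_after_size_bump(b"abc", 8))
```

If the data never makes it to disk at all, the whole file is such a range, which matches the reports of short files that are "all nulls" after a crash.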
* Re: XFS: how to NOT null files on fsck? 2004-07-29 1:30 ` Nathan Scott @ 2004-08-03 18:31 ` L A Walsh 2004-08-04 0:48 ` Andi Kleen 2004-08-05 8:16 ` Helge Hafting 0 siblings, 2 replies; 52+ messages in thread From: L A Walsh @ 2004-08-03 18:31 UTC (permalink / raw) To: Nathan Scott; +Cc: linux-kernel, linux-xfs On 07-28-04 Nathan Scott blissfully wrote: >On Fri, Jul 09, 2004 at 09:37:48AM -0700, L A Walsh wrote: > >>It's a feature! :-) >>It's been in the code for years to randomly write nulls to some files >> >Pfft, nonsense. > The above was meant somewhat tongue-in-cheek, ya know... > The problem relates to an updated inode size >being flushed ahead of the data behind it (hence a size update >can make it out before delayed allocate extents do, and we end >up with a hole beyond the end of file, which reads as zeroes). > I believe I understand the scenario you are talking about, but I don't think it fits the examples I have referred to. In particular, "/etc/fstab". I update 'fstab' on Tuesday, say, it works fine...gets backed up just fine...and I forget about it and move on. Then, 2-3 days later, my system crashes and doesn't want to come up. That's odd, usually after a crash, it just burps a bit and comes back up. I grumble and go for single user. Turns out my 1.2k fstab file is all "nulls". Coincidentally, I find, _maybe_, a couple of other files written around the same time, also nulled, including times when the nulls appeared in the system log for that time period! Now I know it takes a while before data may end up on disk and that it may not go out to disk in an ordered fashion, but 2-3 days? This isn't a case of a multi-extent file. My current fstab is only 1335 bytes long. I doubt it has ever been more than 2. My filesystems all use the Allocation unit (AU) size allowed. 
I wish for something larger than a 4k AU size but I'm told it is limited by the linux page size and to find a PC that uses the IA64 page size to use larger file AU size (but I haven't seen too many of these IA64 machines available from Dell or Gateway...:-) Maybe the code in FAT32 that handles larger AU's could be ported to XFS? If FAT32 can do it...nevermind... I'm sure there are more important issues on the plate. >>Apparently not easily reproduced, no one has a clue why it does it. >>Just does. >> >No, it's actually well known why it behaves this way. >We are looking into ways to address this, and have some >ideas - the trick is fixing it without hurting write >performance - which we will do, it's just not trivial. > You could increase the max AU size :-) But more seriously, is my example of writing a 1 AU sized file that becomes zeroed days later an example of the problem you are speaking of? >There are several techniques to reduce the impact of this >behaviour, as others have described (or see the linux-xfs >archives). > Like setting the disk for synchronous writes? Why not something in between, like guaranteeing the info on a mostly quiescent machine will be written to disk within an hour or so? Or is that not "it"? I haven't seen an incidence of this behavior in several months on my machines so my particular problem may have been fixed and the problem you speak of is unrelated to my own, but the number of unplanned shutdowns on my system has only increased recently, since I upgraded to the stable 2.6 series, whereas before, with 2.4, it could be months between "blue screens". Sad was the day that the linux-kernel "corp" decided on feature development vs. stability in the "stable" kernel series. Isn't that criticism lodged most often against MS? It seems most "companies", incorporated or not, seem to follow similar growth patterns. Wasn't there an Eastern saying about choosing your enemies wisely for you will eventually become like them? 
-l ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-08-03 18:31 ` L A Walsh @ 2004-08-04 0:48 ` Andi Kleen 2004-08-04 6:37 ` L A Walsh 2004-08-05 8:16 ` Helge Hafting 1 sibling, 1 reply; 52+ messages in thread From: Andi Kleen @ 2004-08-04 0:48 UTC (permalink / raw) To: L A Walsh; +Cc: linux-kernel, linux-xfs, nathans L A Walsh <lkml@tlinx.org> writes: > Now I know it takes a while before data may end up on disk and that it > may not go out to disk in an ordered fashion, but 2-3 days? This isn't > a case of a multi-extent file. My current fstab is only 1335 bytes long. > I doubt it has ever been more than 2. Is this perhaps on a laptop? Some scripts for laptop use configure insanely long data flush times to conserve HD spin time. Sometimes it is even completely turned off (laptop mode). The extent flush is dependent on the configured bdflush or pdflush data timeouts. The truncate is independent from this because it is flushed with a different path. -Andi ^ permalink raw reply [flat|nested] 52+ messages in thread
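To check whether a machine has been configured with the long flush times Andi mentions, the writeback timeouts can be read back. The sketch below assumes a 2.6-style kernel that exposes `dirty_expire_centisecs` and `dirty_writeback_centisecs` under `/proc/sys/vm` (2.4 kernels use a different bdflush interface, which is not covered here). The function names are illustrative only, and knobs that are missing are simply skipped.

```python
from pathlib import Path

VM_DIR = Path("/proc/sys/vm")  # assumption: 2.6-style sysctl layout
KNOBS = ("dirty_expire_centisecs", "dirty_writeback_centisecs")


def centisecs_to_seconds(raw: str) -> float:
    """Convert the text of a *_centisecs sysctl file to seconds."""
    return int(raw.strip()) / 100.0


def read_writeback_timeouts(vm_dir: Path = VM_DIR) -> dict:
    """Return {knob name: seconds} for every knob that exists under vm_dir."""
    timeouts = {}
    for name in KNOBS:
        knob = vm_dir / name
        if knob.is_file():
            timeouts[name] = centisecs_to_seconds(knob.read_text())
    return timeouts


if __name__ == "__main__":
    for name, seconds in read_writeback_timeouts().items():
        print(f"{name}: {seconds:.2f} s")
```

Values in the range of minutes or hours would point at a laptop-style power-saving setup rather than at the filesystem.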
* Re: XFS: how to NOT null files on fsck? 2004-08-04 0:48 ` Andi Kleen @ 2004-08-04 6:37 ` L A Walsh 0 siblings, 0 replies; 52+ messages in thread From: L A Walsh @ 2004-08-04 6:37 UTC (permalink / raw) To: Andi Kleen; +Cc: linux-kernel, linux-xfs, nathans Not laptop, 2-CPU workstation used as home "server". :-) Andi Kleen wrote: >L A Walsh <lkml@tlinx.org> writes: > > > >>Now I know it takes a while before data may end up on disk and that it >>may not go out to disk in an ordered fashion, but 2-3 days? This isn't >>a case of a multi-extent file. My current fstab is only 1335 bytes long. >>I doubt it has ever been more than 2. >> >> > >Is this perhaps on a laptop? Some scripts for laptop use configure >insanely long data flush times to conserve HD spin time. Sometimes >it is even completely turned off (laptop mode). The extent >flush is dependent on the configured bdflush or pdflushd data >timeouts. > >The truncate is independent from this because it is flushed with a >different path. > >-Andi > > > > ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-08-03 18:31 ` L A Walsh 2004-08-04 0:48 ` Andi Kleen @ 2004-08-05 8:16 ` Helge Hafting 2004-08-06 1:10 ` Nathan Scott 1 sibling, 1 reply; 52+ messages in thread From: Helge Hafting @ 2004-08-05 8:16 UTC (permalink / raw) To: L A Walsh; +Cc: linux-kernel L A Walsh wrote: > >> The problem relates to an updated inode size >> being flushed ahead of the data behind it (hence a size update >> can make it out before delayed allocate extents do, and we end >> up with a hole beyond the end of file, which reads as zeroes). >> > I believe I understand the scenario you are talking about, but I don't > think it fits the examples I have referred to. In particular, > "/etc/fstab". > I update 'fstab' on Tuesday, say, it works fine...gets backed up just > fine...and I forget about it and move on. Then, 2-3 days later, my > system crashes and doesn't want to come up. That's odd, usually after > a crash, it just burps a bit and comes back up. I grumble and go for > single user. Turns out my 1.2k fstab file is all "nulls". > Coincidentally, > I find, _maybe_, a couple of other files written around the same time, > also nulled, including times when the nulls appeared in the system log > for that time period! > Now I know it takes a while before data may end up on disk and that it > may not go out to disk in an ordered fashion, but 2-3 days? Seems strange to me, but the amount of delay is entirely up to the filesystem. > This isn't > a case of a multi-extent file. My current fstab is only 1335 bytes long. > I doubt it has ever been more than 2. > My filesystems all use the Allocation unit (AU) size allowed. 
I wish > for something larger than a 4k AU size but I'm told it is limited by > the linux page size and to find a PC that uses the IA64 page size to > use larger file AU size (but I haven't seen too many of these IA64 > machines > available from Dell or Gateway...:-) Maybe the code in FAT32 that > handles > larger AU's could be ported to XFS? If FAT32 can do it...nevermind... > I'm sure there are more important issues on the plate. > >>> Apparently not easily reproduced, no one has a clue why it does it. >>> Just does. >> >> No, it's actually well known why it behaves this way. >> We are looking into ways to address this, and have some >> ideas - the trick is fixing it without hurting write >> performance - which we will do, it's just not trivial. >> > You could increase the max AU size :-) But more seriously, is my > example of writing a 1 AU sized file that becomes zeroed days later > an example of the problem you are speaking of? > >> There are several techniques to reduce the impact of this >> behaviour, as others have described (or see the linux-xfs >> archives). >> > Like setting the disk for synchronous writes? Why not something > in between, like guaranteeing the info on a mostly quiescent machine > will be written to disk within an hour or so? Or is that not "it"? > This should be trivial. Edit your crontab, so that cron will run "sync" once per hour. Everything queued for writing when the "sync" command is issued will be on disk when the command finishes. So this guarantees that nothing waits more than 1 hour. (Sync is usually over in a few seconds on a home machine. There should be no more lost "old" files unless the fs has a bug.) You may also want to run "sync" manually before doing something that risks crashing. (Such as moving a live machine, dubious hotplugging, testing beta device drivers . . .) 
> I haven't seen an incidence of this behavior in several months on > my machines so my particular problem may have been fixed and the > problem you speak of is unrelated to my own, but the number of > unplanned shutdowns on my system has only increased recently, since I > upgraded > to the stable 2.6 series, whereas before, with 2.4, it could be months > between "blue screens". You may want to keep using 2.4 for a while then - it probably _is_ a lot more stable. It has been stabilizing for the entire 2.5 development time, 2.6 stabilization has just begun! > > Sad was the day that it was decided that the linux-kernel "corp" decided > on feature development vs. stability in the "stable" kernel series. > Isn't that criticism lodged most often against MS. Not a big problem in this case. If XFS isn't stable enough for you, consider one of the many other filesystems. ext3 or reiserfs for journalling, or plain old ext2. The nice thing about having many features, is that you have a set to choose from. Helge Hafting ^ permalink raw reply [flat|nested] 52+ messages in thread
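Helge's hourly "sync" flushes everything that is dirty. For a single important file, a program can ask for a narrower guarantee itself. The sketch below shows the common write-to-temporary, `fsync`, rename pattern in Python, assuming POSIX semantics for `rename` and `fsync` on both files and directories. The helper name `write_file_durably` and the `.tmp` suffix are hypothetical, and the sketch does not clean up the temporary file if a write fails.

```python
import os
import tempfile


def write_file_durably(path: str, data: bytes) -> None:
    """Replace path with data so that a crash leaves either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = path + ".tmp"  # hypothetical naming scheme for the temporary file
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Force the new data to disk before the name is switched over.
        os.fsync(fd)
    finally:
        os.close(fd)
    # Atomically point the final name at the fully written file.
    os.replace(tmp_path, path)
    # Flush the directory entry as well, so the rename itself is on disk.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "example.conf")
        write_file_durably(target, b"first version\n")
        write_file_durably(target, b"second version\n")
        with open(target, "rb") as handle:
            print(handle.read())
```

The point of the ordering is that the data is forced out before any metadata refers to it, so the window in which a size or name update is on disk without its data is closed for this file.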
* Re: XFS: how to NOT null files on fsck? 2004-08-05 8:16 ` Helge Hafting @ 2004-08-06 1:10 ` Nathan Scott 2004-08-06 1:34 ` Andrew Morton 0 siblings, 1 reply; 52+ messages in thread From: Nathan Scott @ 2004-08-06 1:10 UTC (permalink / raw) To: L A Walsh, Helge Hafting; +Cc: linux-kernel On Thu, Aug 05, 2004 at 10:16:02AM +0200, Helge Hafting wrote: > L A Walsh wrote: > >Now I know it takes a while before data may end up on disk and that it > >may not go out to disk in an ordered fashion, but 2-3 days? > > Seems strange to me, but the amount of delay is entirely up to the > filesystem. The flushing of dirty file data is actually performed by kernel threads outside of the individual filesystems. I cannot explain a 2/3 day wait for data to get flushed, something really strange going on for you there. cheers. -- Nathan ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: XFS: how to NOT null files on fsck? 2004-08-06 1:10 ` Nathan Scott @ 2004-08-06 1:34 ` Andrew Morton 0 siblings, 0 replies; 52+ messages in thread From: Andrew Morton @ 2004-08-06 1:34 UTC (permalink / raw) To: Nathan Scott; +Cc: lkml, helge.hafting, linux-kernel

Nathan Scott <nathans@sgi.com> wrote:
>
> On Thu, Aug 05, 2004 at 10:16:02AM +0200, Helge Hafting wrote:
> > L A Walsh wrote:
> > >Now I know it takes a while before data may end up on disk and that it
> > >may not go out to disk in an ordered fashion, but 2-3 days?
> >
> > Seems strange to me, but the amount of delay is entirely up to the
> > filesystem.
>
> The flushing of dirty file data is actually performed by
> kernel threads outside of the individual filesystems.
>
> I cannot explain a 2/3 day wait for data to get flushed,
> something really strange going on for you there.

Well there was a writeback bug which could cause files to not get written back ever. Perhaps an unmount would cause writeback but nothing else would. It was fixed by the below patch.

The situation will only arise with a combination of a race and a data-synchronising writeback (O_SYNC, fsync, etc). It's unlikely that this is the cause of this report though.

From: Miklos Szeredi <miklos@szeredi.hu>

This patch fixes a hard-to-trigger condition, where the inode is on the inode_in_use list while its state is dirty. In this state dirty pages are not written back in sync() or from kupdate, only from direct page reclaim. And this causes a livelock in balance_dirty_pages after a while.

The actual sequence of events required to get into this state is:

thread  function                                  inode state         inode list
----------------------------------------------------------------------------
1       __sync_single_inode (background)          I_DIRTY             sb->s_io
1       do_writepages ...                         I_LOCKED
2       __writeback_single_inode (sync) sleeps    I_LOCKED
1       __sync_single_inode (background) finish   0                   inode_in_use
2       __writeback_single_inode (sync) wakeup    0
2       __sync_single_inode (sync)                0
2       do_writepages ...                         I_LOCKED
3       __mark_inode_dirty                        I_LOCKED | I_DIRTY
2       __sync_single_inode (sync) finish         I_DIRTY             left on inode_in_use

Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 25-akpm/fs/fs-writeback.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletion(-)

diff -puN fs/fs-writeback.c~fix-inode-state-corruption-268-rc1-bk1 fs/fs-writeback.c
--- 25/fs/fs-writeback.c~fix-inode-state-corruption-268-rc1-bk1 Fri Jul 16 15:06:57 2004
+++ 25-akpm/fs/fs-writeback.c Fri Jul 16 15:06:57 2004
@@ -213,8 +213,9 @@ __sync_single_inode(struct inode *inode,
 		} else if (inode->i_state & I_DIRTY) {
 			/*
 			 * Someone redirtied the inode while were writing back
-			 * the pages: nothing to do.
+			 * the pages.
 			 */
+			list_move(&inode->i_list, &sb->s_dirty);
 		} else if (atomic_read(&inode->i_count)) {
 			/*
 			 * The inode is clean, inuse
_

^ permalink raw reply [flat|nested] 52+ messages in thread
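The sequence in Miklos's table can be followed with a toy model. The sketch below is plain Python, not kernel code: `ToyInode`, the helper functions and the string-valued list names are invented for illustration, and only the final branch of `__sync_single_inode` that the patch touches is modelled. It assumes, as the table indicates, that an inode marked dirty while it is locked keeps its current list and relies on the writer to place it afterwards.

```python
from dataclasses import dataclass, field


@dataclass
class ToyInode:
    state: set = field(default_factory=set)  # subset of {"I_DIRTY", "I_LOCKED"}
    on_list: str = "inode_in_use"
    in_use: bool = True                      # stands in for i_count != 0


def start_writeback(inode: ToyInode) -> None:
    # Writeback clears the dirty bit and locks the inode while pages go out.
    inode.state.discard("I_DIRTY")
    inode.state.add("I_LOCKED")


def mark_dirty(inode: ToyInode) -> None:
    # While the inode is locked only the state changes; the list is left alone.
    inode.state.add("I_DIRTY")
    if "I_LOCKED" not in inode.state:
        inode.on_list = "s_dirty"


def finish_writeback(inode: ToyInode, patched: bool) -> None:
    inode.state.discard("I_LOCKED")
    if "I_DIRTY" in inode.state:
        if patched:
            inode.on_list = "s_dirty"  # models the list_move() added by the patch
        # unpatched: "nothing to do", the inode stays on whatever list it was on
    elif inode.in_use:
        inode.on_list = "inode_in_use"
    else:
        inode.on_list = "inode_unused"


def replay_race(patched: bool):
    inode = ToyInode(state={"I_DIRTY"}, on_list="s_io")
    start_writeback(inode)             # thread 1: background writeback
    finish_writeback(inode, patched)   # thread 1 finishes: clean, inode_in_use
    start_writeback(inode)             # thread 2: sync writeback of the clean inode
    mark_dirty(inode)                  # thread 3: redirties it while locked
    finish_writeback(inode, patched)   # thread 2 finishes
    return inode.state, inode.on_list


if __name__ == "__main__":
    print("unpatched:", replay_race(patched=False))
    print("patched:  ", replay_race(patched=True))
```

In the unpatched run the inode ends up dirty but still on `inode_in_use`, where periodic writeback would not look for it; in the patched run it ends up on `s_dirty`.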
Thread overview: 52+ messages
2004-07-05 5:47 XFS: how to NOT null files on fsck? Norberto Bensa
2004-07-09 16:37 ` L A Walsh
2004-07-09 21:59 ` Chris Wedgwood
2004-07-10 18:33 ` L A Walsh
2004-07-10 18:43 ` Chris Wedgwood
2004-07-10 21:24 ` Bernd Eckenfels
2004-07-11 21:54 ` Helge Hafting
2004-07-12 17:56 ` H. Peter Anvin
2004-07-12 19:59 ` Chris Wedgwood
2004-07-12 20:32 ` H. Peter Anvin
2004-07-12 22:29 ` Bernd Eckenfels
2004-07-12 23:03 ` Bernd Eckenfels
2004-07-12 23:14 ` Chris Wedgwood
2004-07-10 18:43 ` Jan Knutar
2004-07-10 18:46 ` Chris Wedgwood
2004-07-10 18:55 ` Norberto Bensa
2004-07-10 19:19 ` Chris Wedgwood
2004-07-12 21:20 ` Chris Wedgwood
2004-07-12 22:40 ` L A Walsh
2004-07-12 22:53 ` Chris Wedgwood
2004-07-13 1:44 ` Bernd Eckenfels
2004-07-13 5:24 ` Chris Wedgwood
[not found] ` <2hgxc-5x9-9@gated-at.bofh.it>
2004-07-13 7:25 ` Anton Ertl
2004-07-13 8:09 ` Chris Wedgwood
2004-07-13 9:34 ` Anton Ertl
2004-07-13 9:53 ` Chris Wedgwood
2004-07-13 10:27 ` Tim Connors
2004-07-13 10:38 ` ismail dönmez
2004-07-13 11:16 ` Nick Piggin
2004-07-13 12:52 ` ismail dönmez
2004-07-13 10:58 ` Chris Wedgwood
2004-07-13 13:33 ` Anton Ertl
2004-07-13 20:32 ` Chris Wedgwood
2004-07-13 22:42 ` Bernd Eckenfels
2004-07-14 18:49 ` Anton Ertl
2004-07-14 19:00 ` Chris Wedgwood
2004-07-13 22:24 ` Helge Hafting
2004-07-13 22:39 ` Chris Wedgwood
2004-07-13 23:23 ` Bernd Eckenfels
2004-07-14 18:53 ` Anton Ertl
2004-07-10 19:33 ` Andreas Schwab
2004-07-10 19:40 ` Chris Wedgwood
2004-07-10 19:46 ` Norberto Bensa
2004-07-10 20:03 ` Chris Wedgwood
2004-07-11 1:21 ` Gopikrishnan Sidhardhan
2004-07-29 1:30 ` Nathan Scott
2004-08-03 18:31 ` L A Walsh
2004-08-04 0:48 ` Andi Kleen
2004-08-04 6:37 ` L A Walsh
2004-08-05 8:16 ` Helge Hafting
2004-08-06 1:10 ` Nathan Scott
2004-08-06 1:34 ` Andrew Morton