* zero size file after power failure with kernel 2.6.30.5
@ 2009-08-29 19:02 Michael Monnerie
2009-08-29 22:13 ` Eric Sandeen
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Michael Monnerie @ 2009-08-29 19:02 UTC (permalink / raw)
To: xfs
I have /home mounted like this:
/dev/sda3 on /disks/work1 type xfs
(rw,noatime,logbufs=8,logbsize=256k,attr2,barrier,largeio,swalloc)
Hardware: onboard SATA with a single WD VelociRaptor drive.
My power supply melted and so I had a power fail and a sudden death
crash.
( So please remember: even when you have a UPS, your power can fail ! )
After replacing the part, I had almost no issue with my KDE desktop. In
earlier XFS releases, I constantly lost several config files all
truncated to 0 length or at some point only contained NULLs on such
occasions. So the situation improved a lot.
But almost is not good enough: Exactly my kmail config file was 0 sized
- obviously: at least when I started kmail, it started fresh without any
accounts or config, but once I exited kmail the config was created with
the default values and about 12KB size, while my config has >200KB.
Shouldn't it be that this doesn't happen anymore? I'd love to be in a
position where I really can rely on a crash not trashing any of my files
anymore. I used to have reiserfs previously, and never, not a single
time despite many crashes, did I have such an issue. I'd really be
pleased to see such stability in XFS. I'm using barriers - what else
must I do?
mfg zmi
--
// Michael Monnerie, Ing.BSc ----- http://it-management.at
// Tel: 0660 / 415 65 31 .network.your.ideas.
// PGP Key: "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38 500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net Key-ID: 1C1209B4
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: zero size file after power failure with kernel 2.6.30.5
2009-08-29 19:02 zero size file after power failure with kernel 2.6.30.5 Michael Monnerie
@ 2009-08-29 22:13 ` Eric Sandeen
2009-08-31 23:10 ` Peter Grandi
[not found] ` <alpine.DEB.2.00.0908291517350.24777@p34.internal.lan>
2009-09-18 20:05 ` Martin Steigerwald
2 siblings, 1 reply; 9+ messages in thread
From: Eric Sandeen @ 2009-08-29 22:13 UTC (permalink / raw)
To: Michael Monnerie; +Cc: xfs

Michael Monnerie wrote:
> I have /home mounted like this:
> /dev/sda3 on /disks/work1 type xfs
> (rw,noatime,logbufs=8,logbsize=256k,attr2,barrier,largeio,swalloc)
>
> Hardware: onboard SATA with a single WD VelociRaptor drive.
>
> My power supply melted and so I had a power fail and a sudden death
> crash.
> ( So please remember: even when you have a UPS, your power can fail ! )
>
> After replacing the part, I had almost no issue with my KDE desktop. In
> earlier XFS releases, I constantly lost several config files all
> truncated to 0 length or at some point only contained NULLs on such
> occasions. So the situation improved a lot.
>
> But almost is not good enough: Exactly my kmail config file was 0 sized
> - obviously: at least when I started kmail, it started fresh without any
> accounts or config, but once I exited kmail the config was created with
> the default values and about 12KB size, while my config has >200KB.
>
> Shouldn't it be that this doesn't happen anymore? I'd love to be in a
> position where I really can rely on a crash not trashing any of my files
> anymore. I used to have reiserfs previously, and never, not a single
> time despite many crashes, did I have such an issue. I'd really be
> pleased to see such stability in XFS. I'm using barriers - what else
> must I do?
>
> mfg zmi

This will depend on what KDE is doing internally as well. No filesystem
can magically protect against buffered data loss on a crash.

An application could certainly be doing something that results in this
sort of thing. Without reading some KDE code I can't say for sure, and I
don't mean to blame KDE, but this isn't necessarily a bug in XFS.

-Eric
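
[Editorial sketch, not part of the original message.] The application
behaviour Eric alludes to is usually discussed as "write a temporary
file, fsync() it, then rename() it over the old one". The Python below
is a minimal sketch of that pattern under stated assumptions: a
POSIX-like system, and a hypothetical helper name atomic_write that
does not come from KDE or XFS. It is not a description of what kmail
actually does.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` (bytes) so that a crash should leave
    either the complete old file or the complete new file.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so that the final
    # rename stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # userspace buffer -> kernel page cache
            os.fsync(tmp.fileno())  # page cache -> storage device
        os.replace(tmp_path, path)  # atomic rename over the old file
    except BaseException:
        # On any failure, remove the temporary file if it is still there.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Also fsync the directory so the rename itself is made durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Without the fsync() before the rename, the new name can reach disk
before the new data does, which is one way to end up with a zero-length
file after a power failure.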
* Re: zero size file after power failure with kernel 2.6.30.5
2009-08-29 22:13 ` Eric Sandeen
@ 2009-08-31 23:10 ` Peter Grandi
2009-09-01 7:18 ` Michael Monnerie
0 siblings, 1 reply; 9+ messages in thread
From: Peter Grandi @ 2009-08-31 23:10 UTC (permalink / raw)
To: Linux XFS

[ ... ]

>> Shouldn't it be that this doesn't happen anymore? I'd love to
>> be in a position where I really can rely on a crash not
>> trashing any of my files anymore.

Then 'mount' with '-o sync', or write your own applications and
patches to the GNU/Linux kernel to enforce well known atomicity
and persistence semantics.

This issue as related to several filesystems has been discussed
in great depth over the past several months. Consider reading and
if possible try to understand these contributions:

  http://sandeen.net/wordpress/?p=34
  http://sandeen.net/wordpress/?p=42
  https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/317781/comments/45
  http://lwn.net/SubscriberLink/322823/e6979f02e5a73feb/
  http://loupgaroublond.blogspot.com/2009/03/anecdote-about-why-doing-wrong-thing-is.html
  http://thunk.org/tytso/blog/2009/03/12/delayed-allocation-and-the-zero-length-file-problem/
  http://mjg59.livejournal.com/108257.html
  http://www.csamuel.org/2009/04/11/default-ext3-mode-changing-in-2630
  http://tribulaciones.org/2009/03/is-ext4-unsafe/

>> I used to have reiserfs previously, and never, not a single
>> time despite many crashes, did I have such an issue. I'd
>> really be pleased to see such stability in XFS. I'm using
>> barriers - what else must I do?

Barriers under GNU/Linux regrettably only enforce ordering. It is
a POSIX weakness. Actual delays/semantics depend on kernel
version.

> this will depend on what kde is doing internally as well.

Unfortunately KDE like most applications is known not to do the
right thing (depending on specific app and version, but most
IIRC).

[ ... ]
* Re: zero size file after power failure with kernel 2.6.30.5
2009-08-31 23:10 ` Peter Grandi
@ 2009-09-01 7:18 ` Michael Monnerie
2009-09-01 10:32 ` Peter Grandi
0 siblings, 1 reply; 9+ messages in thread
From: Michael Monnerie @ 2009-09-01 7:18 UTC (permalink / raw)
To: xfs

On Tuesday 01 September 2009 Peter Grandi wrote:
> Then 'mount' with '-o sync'
[snip]

Yes. I could also simply switch back to reiserfs, where I never had
this kind of issue, despite lots of crashes etc. I'm not here to blame
the devs, just wanted to report that this kind of problem still exists,
and maybe someone taps into the problem and can improve it.

There was a similar problem with the change from ext3 to ext4, with a
big discussion. Ext4 has been improved, I don't know how good it is
now. And I know lots of discussions whether the app or the kernel is
wrong, and whether you should fsync() after rename(). In ext4 they
reorganized the way metaupdates are done, maybe that can help xfs too.

It seems kmail writes its config every 7 minutes, so it is vulnerable
for 3 seconds then. I've set vm.dirty_expire_centisecs = 1000 now to
improve the situation a bit.

mfg zmi
* Re: zero size file after power failure with kernel 2.6.30.5
2009-09-01 7:18 ` Michael Monnerie
@ 2009-09-01 10:32 ` Peter Grandi
2009-09-01 14:19 ` Emmanuel Florac
2009-09-01 22:52 ` Michael Monnerie
0 siblings, 2 replies; 9+ messages in thread
From: Peter Grandi @ 2009-09-01 10:32 UTC (permalink / raw)
To: Linux XFS

[ ... ]

>> Then 'mount' with '-o sync' [ ... ]

> Yes. I could also simply switch back to reiserfs, where I
> never had this kind of issue, despite lots of crashes etc.

Other people have a very different impression. Like 'ext3'
ReiserFS does ordered writes, but those don't necessarily help
because of the colossal amount of buffering that happens anyhow
nowadays.

> [ ... ] maybe someone taps into the problem and can improve
> it.

It is foremost an application problem, and then a block layer
problem. The first is unsolvable ("user space sucks") in our
lifetimes, and the second depends on the goodwill of the
proprietors of the relevant kernel subsystem.

As to application design, XFS is targeted at heavily parallel
workloads on large storage arrays; its design takes advantage of
what API semantics permit to improve that use case, and relies on
applications making use of those API semantics properly.

If that and having good scalable performance at the same time
requires having dual power supplies, redundant storage paths,
and battery backup, that is the typical platform on which XFS is
deployed.

> There was a similar problem with the change from ext3 to ext4,
> with a big discussion. Ext4 has been improved,

Actually it has been made worse, to compensate for bad
application and block layer behaviour.

Red Hat with 'ext4' have been trying to imply that an in-place
upgrade to an 'ext3' compatible filesystem can support every
possible point on the spectrum. Well, it turned out that they
cannot. So there have been motions towards supporting XFS in 5.4,
to have a dual-filesystem strategy, which is what a large number
of their important enterprise customers do anyhow.

> [ ... ] In ext4 they reorganized the way metaupdates are done,
> maybe that can help xfs too.

But that makes performance worse in the large/parallel case.

> [ ... ] It seems kmail writes its config every 7 minutes, so
> it is vulnerable for 3 seconds then.

That won't help that much. Apps and the block layer are really
designed for older, gentler times. And never mind the clueless,
moronic "optimization" of Linux block layer plugging/unplugging.

Currently a single disk can write 100MB/s, memory sizes on many
_laptops_ are 4GB with potentially 1-2GB or 10-20s of writes
cached. On a server one can have RAIDs that can write at/s.

If applications and the block layer are misbehaving, and '-o
sync' is not used, even if one flushes cache every second, there
can still be dozens of MB (on a laptop) to some GB (on a server)
that get lost in that one second.

The filesystem can try hard to ensure that metadata gets written
nearly immediately, ensuring 'fsck'-consistency, but it cannot do
that for data in any sensible way unless the application and the
block layer do the right thing, so data persistency is at best
elusive.

> I've set vm.dirty_expire_centisecs = 1000 now to improve the
> situation a bit.

It does not help that the applications and the block layer are
not only misdesigned, but were also misdesigned for a time when
data rates were a lot lower, so outstanding updates were bounded
a lot lower.

There are workarounds and by careful patching and changing
default settings one can palliate the worst situations; but for
example 10 seconds of 'dirty_expire_centisecs' seems way too long
(IIRC you have a fairly large memory and RAID) and other settings
matter more.

I have written quite a bit in my blog about these issues, and you
may find this particular entry rather relevant:

  http://www.sabi.co.uk/blog/0707jul.html#070701

In general on a fast machine I would use:

  vm/dirty_ratio =4
  vm/dirty_background_ratio =2
  vm/dirty_expire_centisecs =400
  vm/dirty_writeback_centisecs =200

or half of every one. Short flushing times also ensure more
continuous flushing (without huge periodic gulps), which can
significantly improve *write* performance for streaming
applications (XFS etc. delayed allocation is designed to improve
read performance despite the lack of preallocation).

This cannot be done on laptops, where short flushing times are
bad for power consumption, but at least they are battery backed,
and hopefully SSDs will save us anyhow.
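
[Editorial sketch, not part of the original message.] The arithmetic
behind the "10-20s of writes cached" remark can be made explicit. The
Python below is a deliberately rough model, assuming that the data at
risk is bounded by whichever is smaller: the amount written during one
expiry interval, or a percentage of RAM given by dirty_ratio. The
function name at_risk_mb and the sample figures are illustrative
assumptions; the real kernel accounting is more involved.

```python
def at_risk_mb(write_rate_mb_s, expire_centisecs, ram_mb, dirty_ratio_pct):
    """Rough upper bound (in MB) on buffered writes not yet on disk.

    Simplified model for illustration: ignores the writeback interval,
    per-device limits and how the kernel actually counts dirtyable memory.
    """
    # Data written while pages are still "young" enough to stay in memory.
    by_age = write_rate_mb_s * expire_centisecs / 100
    # Cap imposed by the dirty-memory percentage.
    by_ratio = ram_mb * dirty_ratio_pct / 100
    return min(by_age, by_ratio)


if __name__ == "__main__":
    # One disk at 100 MB/s, 10 s expiry, 4000 MB RAM, generous dirty ratio.
    print(at_risk_mb(100, 1000, 4000, 50))
    # The same disk with the tighter settings suggested in the message.
    print(at_risk_mb(100, 400, 4000, 4))
```

Under this model, shortening the expiry time or lowering the ratio
shrinks the window, but it never reaches zero unless the application
syncs its own data.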
* Re: zero size file after power failure with kernel 2.6.30.5
2009-09-01 10:32 ` Peter Grandi
@ 2009-09-01 14:19 ` Emmanuel Florac
2009-09-01 22:52 ` Michael Monnerie
1 sibling, 0 replies; 9+ messages in thread
From: Emmanuel Florac @ 2009-09-01 14:19 UTC (permalink / raw)
To: Linux XFS

On Tue, 1 Sep 2009 10:32:46 +0000
pg_xf2@xf2.sabi.co.UK (Peter Grandi) wrote:

> If that and having good scalable performance at the same time
> requires having dual power supplies, redundant storage paths,
> and battery backup, that is the typical platform on which XFS is
> deployed.

To temper this: I have used systems with XFS daily for the last 13
years (including IRIX workstations and PCs with only one drive) and
only once had a problem clearly related to XFS (a well known bug,
long since corrected).

--
----------------------------------------
Emmanuel Florac | Intellique
----------------------------------------
* Re: zero size file after power failure with kernel 2.6.30.5
2009-09-01 10:32 ` Peter Grandi
2009-09-01 14:19 ` Emmanuel Florac
@ 2009-09-01 22:52 ` Michael Monnerie
1 sibling, 0 replies; 9+ messages in thread
From: Michael Monnerie @ 2009-09-01 22:52 UTC (permalink / raw)
To: xfs

On Tuesday 01 September 2009 Peter Grandi wrote:
> Other people have a very different impression. Like 'ext3'
> ReiserFS does ordered writes, but those don't necessarily help
> because of the colossal amount of buffering that happens anyhow
> nowadays.

Maybe. I had reiserfs on this system until two weeks ago, with this
quad-core 8GB desktop. Had power failures, crashes, and so on. Can't
remember a situation where a KDE app lost its config.

But I had a server with the OSS XEN, running a single VM which is my
internal mailserver using PostgreSQL as its store on XFS. My daughter
managed to switch the server off (yeah, having redundant power supplies
and UPS are still not enough). After reboot, the PostgreSQL database
was *damaged*, so much that I had to restore. This should never have
happened, and until now I don't know what was to blame for that: XFS?
XEN? The RAID Controller with BBU and hard disk cache=off?

That's why I'm very sensitive to even a small data loss (I had a backup
of my kmail config), and I think the filesystem has to do everything to
try to keep my data. XFS seems to be optimized more for speed than for
data safety, would you say so? I've often heard "enterprise hardware",
which sounds like "if anything crashes, it's your problem" ;-)

> http://www.sabi.co.uk/blog/0707jul.html#070701

I like your blog, and
http://www.myri.com/scs/READMES/README.myri10ge-linux
gave me a good hint to optimize tcp settings a long time ago.

> In general on a fast machine I would use:
> vm/dirty_ratio =4
> vm/dirty_background_ratio =2
> vm/dirty_expire_centisecs =400
> vm/dirty_writeback_centisecs =200

Since May I use these new settings with kernel 2.6.(29|30):

  vm.dirty_background_bytes = 16123456
  vm.dirty_bytes = 250123456
  vm.dirty_expire_centisecs = 1000
  vm.dirty_writeback_centisecs = 100

(the expire was on 3000 until the crash).

mfg zmi
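
[Editorial sketch, not part of the original message.] The two settings
lists in this exchange use different spellings ('vm/dirty_ratio =4'
versus 'vm.dirty_bytes = 250123456') and different units (percent,
bytes, centiseconds). The Python below is a small, hypothetical helper
for reading such lists side by side; parse_sysctl is an illustrative
name, not an existing tool, and it only handles simple integer
"key = value" lines.

```python
def parse_sysctl(text):
    """Parse simple 'key = value' sysctl lines into a dict of integers.

    Keys written with '/' separators are normalised to '.' separators.
    Blank lines and '#' comments are skipped.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip().replace("/", ".")] = int(value.strip())
    return settings


if __name__ == "__main__":
    sample = """
    vm.dirty_background_bytes = 16123456
    vm.dirty_bytes = 250123456
    vm.dirty_expire_centisecs = 1000
    vm.dirty_writeback_centisecs = 100
    """
    for name, value in parse_sysctl(sample).items():
        if name.endswith("_centisecs"):
            print(name, "=", value / 100, "seconds")
        elif name.endswith("_bytes"):
            print(name, "=", round(value / 2**20, 1), "MiB")
        else:
            print(name, "=", value)
```

Note that the kernel treats dirty_bytes and dirty_ratio (and likewise
the two background variants) as alternatives: setting one form
overrides the other.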
[parent not found: <alpine.DEB.2.00.0908291517350.24777@p34.internal.lan>]
* Re: zero size file after power failure with kernel 2.6.30.5
[not found] ` <alpine.DEB.2.00.0908291517350.24777@p34.internal.lan>
@ 2009-08-30 8:39 ` Michael Monnerie
0 siblings, 0 replies; 9+ messages in thread
From: Michael Monnerie @ 2009-08-30 8:39 UTC (permalink / raw)
To: Justin Piszcz, xfs

On Saturday 29 August 2009 Justin Piszcz wrote:
> I was curious if you could show your smartctl -a output?
>
> smartctl -a /dev/sda

Here the important parts of it:

Device Model:     WDC WD1500HLFS-01G6U0
Firmware Version: 04.04V01

ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f 200   200   051    Pre-fail Always  -           0
  3 Spin_Up_Time            0x0003 190   185   021    Pre-fail Always  -           1466
  4 Start_Stop_Count        0x0032 100   100   000    Old_age  Always  -           48
  5 Reallocated_Sector_Ct   0x0033 200   200   140    Pre-fail Always  -           0
  7 Seek_Error_Rate         0x000e 200   200   000    Old_age  Always  -           0
  9 Power_On_Hours          0x0032 092   092   000    Old_age  Always  -           6282
 10 Spin_Retry_Count        0x0012 100   253   000    Old_age  Always  -           0
 11 Calibration_Retry_Count 0x0012 100   253   000    Old_age  Always  -           0
 12 Power_Cycle_Count       0x0032 100   100   000    Old_age  Always  -           48
192 Power-Off_Retract_Count 0x0032 200   200   000    Old_age  Always  -           13
193 Load_Cycle_Count        0x0032 200   200   000    Old_age  Always  -           48
194 Temperature_Celsius     0x0022 112   102   000    Old_age  Always  -           31
196 Reallocated_Event_Count 0x0032 200   200   000    Old_age  Always  -           0
197 Current_Pending_Sector  0x0012 200   200   000    Old_age  Always  -           0
198 Offline_Uncorrectable   0x0010 200   200   000    Old_age  Offline -           0
199 UDMA_CRC_Error_Count    0x003e 200   200   000    Old_age  Always  -           0
200 Multi_Zone_Error_Rate   0x0008 200   200   000    Old_age  Offline -           0

Num Test_Description  Status                   Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline     Completed without error  00%       6274            -
# 2 Extended offline  Completed without error  00%       6251            -
# 3 Short offline     Completed without error  00%       6250            -
# 4 Short offline     Completed without error  00%       6226            -
# 5 Short offline     Completed without error  00%       6202            -
# 6 Short offline     Completed without error  00%       6187            -
# 7 Extended offline  Completed without error  00%       6167            -
# 8 Short offline     Completed without error  00%       6145            -
# 9 Short offline     Completed without error  00%       6121            -
#10 Short offline     Completed without error  00%       6097            -
#11 Extended offline  Completed without error  00%       6090            -
#12 Extended offline  Completed without error  00%       6024            -
#13 Short offline     Completed without error  00%       3613            -
#14 Short offline     Completed without error  00%       3589            -
#15 Extended offline  Completed without error  00%       3567            -
#16 Short offline     Completed without error  00%       3565            -
#17 Short offline     Completed without error  00%       3541            -
#18 Short offline     Completed without error  00%       3517            -
#19 Short offline     Completed without error  00%       3493            -
#20 Short offline     Completed without error  00%       3469            -
#21 Short offline     Completed without error  00%       3445            -

> The PSU issue most likely resulted in the xfs file going to 0 issue
> (i see the same thing on occasion during a PSU issue or crash/reboot)

Yes, and that's annoying. I've never had that for reiserfs, so I guess
it's really XFS to blame here. I like that filesystem, but such things
really shouldn't happen.

> however I am curious to see if you see any of the issues here:
> http://forums.storagereview.net/index.php?showtopic=27303&hl=velociraptor
>
> Since you only use one drive and not a raid of the WD Velociraptors,
> you may not be affected, but I was curious, thanks.

In fact I use 3 VelociRaptor 150GB and 6 Raptors. 8 are in a single
RAID-6 in a server that has been running for about 3 years, and during
this time 2 Raptors died and were replaced by VelociRaptors. The one I
use in my desktop is a spare part to use in case the server needs one.
I just use it because the speed is nice, and I see if it has failures.
Smartctl is running as a daemon here, so you can see the SMART self
tests. It's almost definitely not the drive.

mfg zmi
* Re: zero size file after power failure with kernel 2.6.30.5
2009-08-29 19:02 zero size file after power failure with kernel 2.6.30.5 Michael Monnerie
2009-08-29 22:13 ` Eric Sandeen
[not found] ` <alpine.DEB.2.00.0908291517350.24777@p34.internal.lan>
@ 2009-09-18 20:05 ` Martin Steigerwald
2 siblings, 0 replies; 9+ messages in thread
From: Martin Steigerwald @ 2009-09-18 20:05 UTC (permalink / raw)
To: xfs, Michael Monnerie

On Saturday 29 August 2009 Michael Monnerie wrote:
> I have /home mounted like this:
> /dev/sda3 on /disks/work1 type xfs
> (rw,noatime,logbufs=8,logbsize=256k,attr2,barrier,largeio,swalloc)
>
> Hardware: onboard SATA with a single WD VelociRaptor drive.
>
> My power supply melted and so I had a power fail and a sudden death
> crash.
> ( So please remember: even when you have a UPS, your power can fail ! )
[...]
> But almost is not good enough: Exactly my kmail config file was 0 sized
> - obviously: at least when I started kmail, it started fresh without
> any accounts or config, but once I exited kmail the config was created
> with the default values and about 12KB size, while my config has
> >200KB.

Most likely a missing-fsync() issue that still could happen with XFS.
That's a long discussion ;-).

Try

# KDE Sync
# http://oss.sgi.com/pipermail/xfs/2009-March/040628.html
export KDE_EXTRA_FSYNC=1

This environment variable didn't have any effect with KDE 3 but should
work with recent KDE versions.

See also: http://bugs.kde.org/187172

I switched to Ext4 for my work notebook in the meantime, but my Amarok
laptop is still using XFS. Ext4 skips delayed allocation for certain
cases, AFAIR truncates and renames, since kernel 2.6.30 as Linus and
others urged Theodore Ts'o to make Ext4 behave nicely with applications.
XFS only does so for truncates and there is a little race still, AFAIK.

In the meantime I tend to agree that the filesystem should play it safe
- POSIX semantics or not. But I have not completely made up my mind yet.

Ciao,
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
Thread overview: 9+ messages
2009-08-29 19:02 zero size file after power failure with kernel 2.6.30.5 Michael Monnerie
2009-08-29 22:13 ` Eric Sandeen
2009-08-31 23:10 ` Peter Grandi
2009-09-01 7:18 ` Michael Monnerie
2009-09-01 10:32 ` Peter Grandi
2009-09-01 14:19 ` Emmanuel Florac
2009-09-01 22:52 ` Michael Monnerie
[not found] ` <alpine.DEB.2.00.0908291517350.24777@p34.internal.lan>
2009-08-30 8:39 ` Michael Monnerie
2009-09-18 20:05 ` Martin Steigerwald