David Chinner wrote:
> On Wed, May 09, 2007 at 05:54:09PM -0700, Jeremy Fitzhardinge wrote:
>
>> David Chinner wrote:
>>
>>> Suspend-resume, eh?
>>>
>>> There's an immediate suspect. Can you test this specifically for us?
>>> i.e. download a known good file set, do some stuff, suspend, resume,
>>> then check the files? If it doesn't show up the first time, can
>>> you do it a few times just to rule it out?
>>>
>> Well, I've been doing suspend-resume with xfs for a while without
>> problems; the problems seem to be recent and easily repeatable. Which
>> just means that it could be a new suspend-resume problem, of course.
>>
>
> Ok. I'm just trying to find a relatively simple test case for the
> problem - seeing as you seem to be able to reliably reproduce this
> we should be able to work out the trigger...
>

OK, I was able to reproduce it reliably with a script which did
basically:

    for i in `seq 20`; do
        hg clone -U --pull a b-$i
        hg verify b-$i    # always OK
        umount /home
        sleep 5
        mount /home
        hg verify b-$i    # often found truncated files
    done

No suspend/resumes involved. The trees are linux kernel ones, so
fairly large, but small enough to fit entirely in core.

My script also captured xfs_bmap before/after output for files which
had tended to be corrupted in the past, but unfortunately none of them
got corrupted in these tests. But I do have all the trees lying around
to extract more detail from if you like.

Interestingly, the corruption happened in each case around the same
place in the tree, often in the sata drivers. I wonder if that was just
related to the timing of this script.

Attaching script and results.

    J