From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andi Kleen Subject: Still data corruption with LTP doio on 2.6.17rc Date: Thu, 8 Jun 2006 12:44:12 +0200 Message-ID: <200606081244.13000.ak@suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Cc: nfs@lists.sourceforge.net, linux-kernel@vger.kernel.org, trond.myklebust@fys.uio.no, okir@suse.de Return-path: To: neilb@cse.unsw.edu.au, resource@suse.de Sender: linux-kernel-owner@vger.kernel.org List-ID: I'm still seeing data corruption when running LTP over NFS between two 2.6.17rc* hosts. I already saw this before 2.6.16 and reported. Server is running knfsd 2.6.17-rc4-git9, client is running 2.6.17-rc6 with nfsroot. Both x86-64 and SUSE 10.0 userland. The file system is exported as async and mounted with /dev/root / nfs rw,vers=2,rsize=4096,wsize=4096,hard,nolock,proto=udp,timeo=11,retrans=2,addr=10.23.204.1 0 0 First I always get lots of do_vfs_lock: VFS is out of sync with lock manager! messages on the client. They don't seem to be directly related though. I set up ltp-full-20051103 on the NFS root and run it on the client with runltplite.sh. Eventually it reports <<>> tag=rwtest03 stime=1149754762 cmdline="export LTPROOT; rwtest -N rwtest03 -c -q -i 60s -n 2 -f buffered -s mmread,mmwrite -m random -Dv 10%25000:mm-buff-$$" contacts="" analysis=exit initiation_status="ok" <<>> doio(rwtest03) ( 8155) 08:19:23 --------------------- *** DATA COMPARISON ERROR *** check_file(/tmp/ltp-2256/mm-buff-8139, 7813848, 81293, U:8155:bigfoot:doio*, 20, 0) failed Comparison fd is 3, with open flags 0 Corrupt regions follow - unprintable chars are represented as '.' ----------------------------------------------------------------- corrupt bytes starting at file offset 7813848 1st 32 expected bytes: U:8155:bigfoot:doio*U:8155:bigfo 1st 32 actual bytes: ................................ Request number 36 fd 4 is file /tmp/ltp-2256/mm-buff-8139 - open flags are 02 O_RDWR, write done at file offset 7813848 - pattern is U (0125) number of requests is 1, strides per request is 1 i/o byte count = 81293 memory alignment is unaligned syscall: mmap-write(NULL, 12800000, PROT_WRITE, MAP_SHARED, 4, 0) file is mmaped to: 0x2b73b87f0000 file-mem=0x2b73b8f63ad8, length=81293, buffer=0x52d540 est03) ( 8152) 08:19:23 --------------------- (parent) pid 8155 exited because of data compare errors doio(rwtest03) ( 8154) 08:19:23 --------------------- *** DATA COMPARISON ERROR *** check_file(/tmp/ltp-2256/mm-buff-8139, 4223869, 130577, X:8154:bigfoot:doio*, 20, 0) failed Comparison fd is 4, with open flags 0 Corrupt regions follow - unprintable chars are represented as '.' ----------------------------------------------------------------- corrupt bytes starting at file offset 4349952 1st 32 expected bytes: 154:bigfoot:doio*X:8154:bigfoot: 1st 32 actual bytes: ................................ Request number 36 fd 3 is file /tmp/ltp-2256/mm-buff-8139 - open flags are 02 O_RDWR, write done at file offset 4223869 - pattern is X (0130) number of requests is 1, strides per request is 1 i/o byte count = 130577 memory alignment is unaligned syscall: mmap-write(NULL, 12800000, PROT_WRITE, MAP_SHARED, 3, 0) file is mmaped to: 0x2b73b87f0000 file-mem=0x2b73b8bf737d, length=130577, buffer=0x52d540 doio(rwtest03) ( 8152) 08:19:23 --------------------- (parent) pid 8154 exited because of data compare errors rwtest(rwtest03) : iogen reported errors (r=141) rwtest03 1 FAIL : Test failed <<>> duration=1 termination_type=exited termination_id=2 corefile=no cutime=1 cstime=7 <<>> <<>> tag=rwtest04 stime=1149754763 cmdline="export LTPROOT; rwtest -N rwtest04 -c -q -i 60s -n 2 -f sync -s mmread,mmwrite -m random -Dv 10%25000:mm-sync-$$" contacts="" analysis=exit initiation_status="ok" <<>> doio(rwtest04) ( 8174) 08:19:23 --------------------- *** DATA COMPARISON ERROR *** check_file(/tmp/ltp-2256/mm-sync-8159, 12102771, 46088, O:8174:bigfoot:doio*, 20, 0) failed Comparison fd is 4, with open flags 0 Corrupt regions follow - unprintable chars are represented as '.' ----------------------------------------------------------------- corrupt bytes starting at file offset 12128256 1st 32 expected bytes: 4:bigfoot:doio*O:8174:bigfoot:do 1st 32 actual bytes: ................................ Request number 11 fd 3 is file /tmp/ltp-2256/mm-sync-8159 - open flags are 010002 O_RDWR,O_SYNC, write done at file offset 12102771 - pattern is O (0117) number of requests is 1, strides per request is 1 i/o byte count = 46088 memory alignment is unaligned syscall: mmap-write(NULL, 12800000, PROT_WRITE, MAP_SHARED, 3, 0) file is mmaped to: 0x2b8b163cc000 file-mem=0x2b8b16f56c73, length=46088, buffer=0x52d546 doio(rwtest04) ( 8172) 08:19:23 --------------------- (parent) pid 8174 exited because of data compare errors rwtest(rwtest04) : doio reported errors (r=4) rwtest04 1 FAIL : doio reported errors (r=4) rwtest04 1 FAIL : Test failed <<>> duration=66 termination_type=exited termination_id=4 corefile=no cutime=33 cstime=170 <<>> -Andi