Date: Tue, 7 May 2013 17:11:02 +1000
From: Dave Chinner
To: xfs@oss.sgi.com
Subject: [problem] xfstests generic/311 unreliable...
Message-ID: <20130507071102.GA24635@dastard>

Hi Josef,

I was just looking at generic/311, and I think there's something
fundamentally wrong with the way it is checking the scratch device.

You reported it was failing for internal test 19 on XFS, but I'm
seeing it fail after the first test or two, randomly. It has never
made it past test 3.

So I had a bit of a closer look at its structure. Essentially it is
doing this (with the contents seen by each step):

scratch dev + mkfs                   +-------------------------------+
overlay dm-flakey                    D-------------------------------D
mount/write/kill/unmount dm-flakey   Dx-x-x-x-x-x-x------------------D

All good up to here. Now, you run _check_scratch_fs, which sees:

scratch dev + check                  +-------------------------------+

i.e. it's not seeing all the changes written to dm-flakey, and so
xfs_check is seeing corruption.
After I realised this was stacking block devices and checking the
underlying block device, the cause was pretty obvious: scratch-dev
and dm-flakey have different address spaces, so changes written
through one address space will not be seen through the other address
space if there is stale cached data in the original address space.

And that's exactly what is happening. This patch:

--- a/tests/generic/311
+++ b/tests/generic/311
@@ -79,6 +79,7 @@ _mount_flakey()
 _unmount_flakey()
 {
 	$UMOUNT_PROG $SCRATCH_MNT
+	echo 3 > /proc/sys/vm/drop_caches
 }
 
 _load_flakey_table()

makes the problem go away for xfs_check. But really, I don't like the
assumption that the test is built on - that writes through one block
device are visible through another. It's just asking for weird
problems.

Is there some way that you can restructure this test so it doesn't
have this problem (e.g. do everything on dm-flakey)?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
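
For illustration only: a narrower variant of the drop_caches workaround
in the patch above would be to flush and invalidate the cached data of
just the one block device that is about to be checked. The sketch below
assumes util-linux blockdev is available and that the caller is root.
The helper name flush_bdev_cache is hypothetical and is not part of
xfstests, and this has not been tested against the failure described in
this message.

```shell
#!/bin/sh
# Hypothetical helper (not part of xfstests): flush and invalidate the
# cached data of a single block device instead of dropping every cache
# in the system via /proc/sys/vm/drop_caches.
flush_bdev_cache() {
    dev="$1"
    # Refuse anything that is not a block device node.
    if [ ! -b "$dev" ]; then
        echo "not a block device: $dev" >&2
        return 1
    fi
    # blockdev --flushbufs issues the BLKFLSBUF ioctl on the device.
    # Needs root.
    blockdev --flushbufs "$dev"
}

# Assumed usage, after unmounting dm-flakey and before checking:
#   flush_bdev_cache "$SCRATCH_DEV"
```

This only papers over the same assumption the patch does. Running the
check through dm-flakey itself avoids relying on two address spaces
staying coherent.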