From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from cuda.sgi.com (cuda3.sgi.com [192.48.176.15]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id q3N5xY0m127558 for ; Mon, 23 Apr 2012 00:59:34 -0500 Received: from ipmail04.adl6.internode.on.net (ipmail04.adl6.internode.on.net [150.101.137.141]) by cuda.sgi.com with ESMTP id TaLE8qmhRD8PjzBf for ; Sun, 22 Apr 2012 22:59:33 -0700 (PDT) Received: from disappointment ([192.168.1.1]) by dastard with esmtp (Exim 4.76) (envelope-from ) id 1SMCIy-0001aN-Qf for xfs@oss.sgi.com; Mon, 23 Apr 2012 15:59:20 +1000 Received: from dave by disappointment with local (Exim 4.77) (envelope-from ) id 1SMCIy-0004Y3-Mo for xfs@oss.sgi.com; Mon, 23 Apr 2012 15:59:20 +1000 From: Dave Chinner Subject: [PATCH 14/37] xfs: Use preallocation for inodes with extsz hints Date: Mon, 23 Apr 2012 15:58:44 +1000 Message-Id: <1335160747-17254-15-git-send-email-david@fromorbit.com> In-Reply-To: <1335160747-17254-1-git-send-email-david@fromorbit.com> References: <1335160747-17254-1-git-send-email-david@fromorbit.com> List-Id: XFS Filesystem from SGI List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: xfs-bounces@oss.sgi.com Errors-To: xfs-bounces@oss.sgi.com To: xfs@oss.sgi.com From: Dave Chinner xfstest 229 exposes a problem with buffered IO, delayed allocation and extent size hints. That is when we do delayed allocation during buffered IO, we reserve space for the extent size hint alignment and allocate the physical space to align the extent, but we do not zero the regions of the extent that aren't written by the write(2) syscall. The result is that we expose stale data in unwritten regions of the extent size hints. There are two ways to fix this. The first is to detect that we are doing unaligned writes, check if there is already a mapping or data over the extent size hint range, and if not zero the page cache first before then doing the real write. This can be very expensive for large extent size hints, especially if the subsequent writes fill then entire extent size before the data is written to disk. The second, and simpler way, is simply to turn off delayed allocation when the extent size hint is set and use preallocation instead. This results in unwritten extents being laid down on disk and so only the written portions will be converted. This matches the behaviour for direct IO, and will also work for the real time device. The disadvantage of this approach is that for small extent size hints we can get file fragmentation, but in general extent size hints are fairly large (e.g. stripe width sized) so this isn't a big deal. Implement the second approach as it is simple and effective. Signed-off-by: Dave Chinner Reviewed-by: Mark Tinguely --- fs/xfs/xfs_aops.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c index 4588a7c..eff2ea8 100644 --- a/fs/xfs/xfs_aops.c +++ b/fs/xfs/xfs_aops.c @@ -1175,7 +1175,7 @@ __xfs_get_blocks( (!nimaps || (imap.br_startblock == HOLESTARTBLOCK || imap.br_startblock == DELAYSTARTBLOCK))) { - if (direct) { + if (direct || xfs_get_extsz_hint(ip)) { /* * Drop the ilock in preparation for starting the block * allocation transaction. It will be retaken -- 1.7.9.5 _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs