public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Mark Tinguely <tinguely@sgi.com>
Cc: xfs@oss.sgi.com, Andrew Dahl <adahl@sgi.com>
Subject: Re: [PATCH 02.5/32] xfs: remove xfs_tosspages
Date: Wed, 21 Nov 2012 19:05:02 +1100	[thread overview]
Message-ID: <20121121080502.GP2591@dastard> (raw)
In-Reply-To: <50A3F80C.7050502@sgi.com>

On Wed, Nov 14, 2012 at 01:59:08PM -0600, Mark Tinguely wrote:
> On 11/14/12 12:52, Andrew Dahl wrote:
> >
> >Reversing the check on XFS_IOC_ZERO_RANGE.
> >
> >Range should be zeroed if the start is less than or equal to the end.
> >
> >Signed-off-by: Andrew Dahl<adahl@sgi.com>
> >
> >---
> 
> Tests correctly.

Actually, it doesn't. Test 242 still fails. Yeah, there was already
a regression test for this case, it's just that the golden output
wasn't correct so it never detected the single first block zero
failure even though it was tested.  Now it throws an md5sum mismatch
error, indicating that the behaviour has changed iin some unexpected
way and something is not right with the world.

$ sudo ./check 242
FSTYP         -- xfs (debug)
PLATFORM      -- Linux/x86_64 test-1 3.7.0-rc1-dgc+
MKFS_OPTIONS  -- -f -bsize=4096 /dev/vdb
MOUNT_OPTIONS -- /dev/vdb /mnt/scratch

242      - output mismatch (see 242.out.bad)
--- 242.out     2012-11-21 13:13:22.000000000 +1100
+++ 242.out.bad 2012-11-21 15:41:02.000000000 +1100
@@ -74,4 +74,4 @@
 eecb7aa303d121835de05028751d301c
        17. data -> hole in single block file
 0: [0..7]: unwritten
-56819989ef2d9f40785adce8c06b64d0
+5fed275e7617a806f94c173746a2a723
Ran: 242
Failures: 242
Failed 1 of 1 tests

[ Here's a tip for the future: anything that changes allocation
corner cases needs to be run through the entire of xfstests suite
because they have a nasty habit of causing secondary problems.... ]

I can confirm that the page cache page is not being tossed for
this case (end is -1, start is 128) so the fix for the problem in
the commit is good, but there's more problems here. Clearly it is
that there is data in the page cache:

@@ -74,4 +74,7 @@
 eecb7aa303d121835de05028751d301c
        17. data -> hole in single block file
 0: [0..7]: unwritten
-56819989ef2d9f40785adce8c06b64d0
+0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd
+*
+0001000
+5fed275e7617a806f94c173746a2a723

And that is wrong, wrong, wrong for an unwritten extent.

So, before even looking for the bug, what's the correct behaviour
here?  It's not directly specified in the man page, but XFS_IOC_ZERO
was really only implemented to zero whole blocks.  However, it makes
sense to handle partial blocks in a sane and consistent manner,
zeroing them correctly similar to XFS_IOC_UNRESVSP and hence
providing full byte range zeroing capability.

With this in mind, I look just looked at test 290 in more detail.
To me, the basic premise of the test is fundamentally wrong:

# Nothing should be tossed unless the range includes a page boundry

XFS_IOC_ZERO's functionality is not defined by page boundaries or
kernel internal behaviours - they may influence behaviour, but they
certainly don't define the behaviour. What I see in test 290 is an
encoding of the current truncate_pagecache_range() semantics, not an
encoding of the intent of XFS_IOC_ZERO_RANGE.  I didn't pay enough
attention to what this test was doing in the first place (my fault),
but the current behaviour is, IMO, borderline insane. :/

So, lets just make it sane by updating XFS_IOC_ZERO_RANGE to full
byte range granularity - it's simple enough to do. We can fix 242
and 290 quickly enough, anyway...

FWIW, this isn't currently optimal (we can avoid zeroing if the
partial blocks fall on holes or unwritten extents), but is a minor
problem compared to correct behaviour, and so that can be fixed
later.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

xfs: byte range granularity for XFS_IOC_ZERO_RANGE

From: Dave Chinner <dchinner@redhat.com>

XFS_IOC_ZERO_RANGE simply does not work properly for non page cache
aligned ranges. Neither test 242 or 290 exercise this correctly, so
the behaviour is completely busted even though the tests pass.

Fix it to support full byte range granularity as was originally
intended for this ioctl.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_file.c     |    2 +-
 fs/xfs/xfs_vnodeops.c |   84 ++++++++++++++++++++++++++++++++++++-------------
 fs/xfs/xfs_vnodeops.h |    1 +
 3 files changed, 65 insertions(+), 22 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 400b187..67284ed 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -86,7 +86,7 @@ xfs_rw_ilock_demote(
  *	valid before the operation, it will be read from disk before
  *	being partially zeroed.
  */
-STATIC int
+int
 xfs_iozero(
 	struct xfs_inode	*ip,	/* inode			*/
 	loff_t			pos,	/* offset in file		*/
diff --git a/fs/xfs/xfs_vnodeops.c b/fs/xfs/xfs_vnodeops.c
index 2688079..544e9f1 100644
--- a/fs/xfs/xfs_vnodeops.c
+++ b/fs/xfs/xfs_vnodeops.c
@@ -2095,6 +2095,61 @@ xfs_free_file_space(
 	return error;
 }
 
+
+STATIC int
+xfs_zero_file_space(
+	struct xfs_inode	*ip,
+	xfs_off_t		offset,
+	xfs_off_t		len,
+	int			attr_flags)
+{
+	struct xfs_mount	*mp = ip->i_mount;
+	uint			rounding;
+	xfs_off_t		start;
+	xfs_off_t		end;
+	int			error;
+
+	rounding = max_t(uint, 1 << mp->m_sb.sb_blocklog, PAGE_CACHE_SIZE);
+
+	/* round the range iof extents we are going to convert inwards */
+	start = round_up(offset, rounding);
+	end = round_down(offset + len, rounding);
+
+	ASSERT(start >= offset);
+	ASSERT(end <= offset + len);
+
+	if (!(attr_flags & XFS_ATTR_NOLOCK))
+		xfs_ilock(ip, XFS_IOLOCK_EXCL);
+
+	if (start < end - 1) {
+		/* punch out the page cache over the conversion range */
+		truncate_pagecache_range(VFS_I(ip), start, end - 1);
+		/* convert the blocks */
+		error = xfs_alloc_file_space(ip, start, end - start - 1,
+				    XFS_BMAPI_PREALLOC | XFS_BMAPI_CONVERT,
+				    attr_flags);
+		if (error)
+			goto out_unlock;
+	} else {
+		/* it's a sub-rounding range */
+		ASSERT(offset + len <= rounding);
+		error = xfs_iozero(ip, offset, len);
+		goto out_unlock;
+	}
+
+	/* now we've handled the interior of the range, handle the edges */
+	if (start != offset)
+		error = xfs_iozero(ip, offset, start - offset);
+	if (!error && end != offset + len)
+		error = xfs_iozero(ip, end, offset + len - end);
+
+out_unlock:
+	if (!(attr_flags & XFS_ATTR_NOLOCK))
+		xfs_iunlock(ip, XFS_IOLOCK_EXCL);
+	return error;
+
+}
+
 /*
  * xfs_change_file_space()
  *      This routine allocates or frees disk space for the given file.
@@ -2120,10 +2175,8 @@ xfs_change_file_space(
 	xfs_fsize_t	fsize;
 	int		setprealloc;
 	xfs_off_t	startoffset;
-	xfs_off_t	end;
 	xfs_trans_t	*tp;
 	struct iattr	iattr;
-	int		prealloc_type;
 
 	if (!S_ISREG(ip->i_d.di_mode))
 		return XFS_ERROR(EINVAL);
@@ -2172,31 +2225,20 @@ xfs_change_file_space(
 	startoffset = bf->l_start;
 	fsize = XFS_ISIZE(ip);
 
-	/*
-	 * XFS_IOC_RESVSP and XFS_IOC_UNRESVSP will reserve or unreserve
-	 * file space.
-	 * These calls do NOT zero the data space allocated to the file,
-	 * nor do they change the file size.
-	 *
-	 * XFS_IOC_ALLOCSP and XFS_IOC_FREESP will allocate and free file
-	 * space.
-	 * These calls cause the new file data to be zeroed and the file
-	 * size to be changed.
-	 */
 	setprealloc = clrprealloc = 0;
-	prealloc_type = XFS_BMAPI_PREALLOC;
-
 	switch (cmd) {
 	case XFS_IOC_ZERO_RANGE:
-		prealloc_type |= XFS_BMAPI_CONVERT;
-		end = round_down(startoffset + bf->l_len, PAGE_SIZE) - 1;
-		if (startoffset <= end)
-			truncate_pagecache_range(VFS_I(ip), startoffset, end);
-		/* FALLTHRU */
+		error = xfs_zero_file_space(ip, startoffset, bf->l_len,
+						attr_flags);
+		if (error)
+			return error;
+		setprealloc = 1;
+		break;
+
 	case XFS_IOC_RESVSP:
 	case XFS_IOC_RESVSP64:
 		error = xfs_alloc_file_space(ip, startoffset, bf->l_len,
-						prealloc_type, attr_flags);
+						XFS_BMAPI_PREALLOC, attr_flags);
 		if (error)
 			return error;
 		setprealloc = 1;
diff --git a/fs/xfs/xfs_vnodeops.h b/fs/xfs/xfs_vnodeops.h
index 91a03fa..5163022 100644
--- a/fs/xfs/xfs_vnodeops.h
+++ b/fs/xfs/xfs_vnodeops.h
@@ -49,6 +49,7 @@ int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name, int flags);
 int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
 		int flags, struct attrlist_cursor_kern *cursor);
 
+int xfs_iozero(struct xfs_inode *, loff_t, size_t);
 int xfs_zero_eof(struct xfs_inode *, xfs_off_t, xfs_fsize_t);
 int xfs_free_eofblocks(struct xfs_mount *, struct xfs_inode *, bool);
 

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2012-11-21  8:02 UTC|newest]

Thread overview: 91+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-11-12 11:53 [PATCH 00/32] xfs: current queue for 3.8 Dave Chinner
2012-11-12 11:53 ` [PATCH 01/32] xfs: add more attribute tree trace points Dave Chinner
2012-11-12 22:11   ` Mark Tinguely
2012-11-15 16:18   ` Christoph Hellwig
2012-11-12 11:53 ` [PATCH 02/32] xfs: remove xfs_tosspages Dave Chinner
2012-11-14  6:42   ` [PATCH 02/32 V2] " Dave Chinner
2012-11-14 18:50     ` Andrew Dahl
2012-11-14 18:52       ` [PATCH 02.5/32] " Andrew Dahl
2012-11-14 19:59         ` Mark Tinguely
2012-11-21  8:05           ` Dave Chinner [this message]
2012-11-22  5:10             ` Andrew Dahl
2012-11-22 23:29               ` Dave Chinner
2012-11-26 18:04                 ` Andrew Dahl
2012-11-14 21:17       ` [PATCH 02/32 V2] " Dave Chinner
2012-11-15 16:22     ` Christoph Hellwig
2012-11-12 11:53 ` [PATCH 03/32] xfs: remove xfs_wait_on_pages() Dave Chinner
2012-11-15 16:23   ` Christoph Hellwig
2012-11-12 11:53 ` [PATCH 04/32] xfs: remove xfs_flush_pages Dave Chinner
2012-11-15 16:24   ` Christoph Hellwig
2012-11-12 11:53 ` [PATCH 05/32] xfs: remove xfs_flushinval_pages Dave Chinner
2012-11-15 16:28   ` Christoph Hellwig
2012-11-15 20:54     ` Dave Chinner
2012-11-21 10:12       ` Christoph Hellwig
2012-11-12 11:53 ` [PATCH 06/32] xfs: use btree block initialisation functions in growfs Dave Chinner
2012-11-13 21:18   ` Rich Johnston
2012-11-23 12:40   ` Christoph Hellwig
2012-11-23 21:25     ` Dave Chinner
2012-11-12 11:53 ` [PATCH 07/32] xfs: growfs: use uncached buffers for new headers Dave Chinner
2012-11-13 21:18   ` Rich Johnston
2012-11-12 11:54 ` [PATCH 08/32] xfs: make growfs initialise the AGFL header Dave Chinner
2012-11-13 21:18   ` Rich Johnston
2012-11-23 12:41   ` Christoph Hellwig
2012-11-23 21:27     ` Dave Chinner
2012-11-12 11:54 ` [PATCH 09/32] xfs: make buffer read verication an IO completion function Dave Chinner
2012-11-12 11:54 ` [PATCH 10/32] xfs: uncached buffer reads need to return an error Dave Chinner
2012-11-12 11:54 ` [PATCH 11/32] xfs: verify superblocks as they are read from disk Dave Chinner
2012-11-23 12:42   ` Christoph Hellwig
2012-11-12 11:54 ` [PATCH 12/32] xfs: verify AGF blocks " Dave Chinner
2012-11-13  1:09   ` Phil White
2012-11-13  3:07     ` Dave Chinner
2012-11-14  6:44   ` [PATCH 12/32 V2] " Dave Chinner
2012-11-14 21:28     ` Mark Tinguely
2012-11-12 11:54 ` [PATCH 13/32] xfs: verify AGI " Dave Chinner
2012-11-12 11:54 ` [PATCH 14/32] xfs: verify AGFL " Dave Chinner
2012-11-12 11:54 ` [PATCH 15/32] xfs: verify inode buffers " Dave Chinner
2012-11-12 11:54 ` [PATCH 16/32] xfs: verify btree blocks " Dave Chinner
2012-11-12 11:54 ` [PATCH 17/32] xfs: verify dquot " Dave Chinner
2012-11-14  6:50   ` [PATCH 17/32 V2] " Dave Chinner
2012-11-15 17:55     ` Mark Tinguely
2012-11-15 20:48       ` Dave Chinner
2012-11-15 21:01         ` Mark Tinguely
2012-11-15 21:16           ` Dave Chinner
2012-11-15 21:34             ` Mark Tinguely
2012-11-15 22:01               ` Dave Chinner
2012-11-15 22:09                 ` Dave Chinner
2012-11-15 22:26                 ` Mark Tinguely
2012-11-15 22:33                   ` Dave Chinner
2012-11-16  1:22                     ` Dave Chinner
2012-11-12 11:54 ` [PATCH 18/32] xfs: add verifier callback to directory read code Dave Chinner
2012-11-12 11:54 ` [PATCH 19/32] xfs: factor dir2 block read operations Dave Chinner
2012-11-15  3:09   ` Ben Myers
2012-11-15  5:59     ` Dave Chinner
2012-11-12 11:54 ` [PATCH 20/32] xfs: verify dir2 block format buffers Dave Chinner
2012-11-12 11:54 ` [PATCH 21/32] xfs: factor dir2 free block reading Dave Chinner
2012-11-12 11:54 ` [PATCH 22/32] xfs: factor out dir2 data " Dave Chinner
2012-11-12 11:54 ` [PATCH 23/32] xfs: factor dir2 leaf read Dave Chinner
2012-11-12 11:54 ` [PATCH 24/32] xfs: factor and verify attr leaf reads Dave Chinner
2012-11-12 11:54 ` [PATCH 25/32] xfs: add xfs_da_node verification Dave Chinner
2012-11-12 11:54 ` [PATCH 26/32] xfs: Add verifiers to dir2 data readahead Dave Chinner
2012-11-12 11:54 ` [PATCH 27/32] xfs: add buffer pre-write callback Dave Chinner
2012-11-15  6:02   ` [PATCH 27/32 REPOST] " Dave Chinner
2012-11-12 11:54 ` [PATCH 28/32] xfs: add pre-write metadata buffer verifier callbacks Dave Chinner
2012-11-14  6:52   ` [PATCH 28/32 V2] " Dave Chinner
2012-11-14 22:23     ` Mark Tinguely
2012-11-12 11:54 ` [PATCH 29/32] xfs: connect up write verifiers to new buffers Dave Chinner
2012-11-14  6:53   ` [PATCH 29/32 V2] " Dave Chinner
2012-11-12 11:54 ` [PATCH 30/32] xfs: convert buffer verifiers to an ops structure Dave Chinner
2012-11-14  6:54   ` [PATCH 30/32 V2] " Dave Chinner
2012-11-12 11:54 ` [PATCH 31/32] xfs: add CRC infrastructure Dave Chinner
2012-11-12 15:37   ` Mark Tinguely
2012-11-15 22:20   ` [PATCH 31/32 V2] " Dave Chinner
2012-11-12 11:54 ` [PATCH 32/32] xfs: add CRC checks to the log Dave Chinner
2012-11-12 15:37   ` Mark Tinguely
2012-11-13 23:26 ` [PATCH 00/32] xfs: current queue for 3.8 Ben Myers
2012-11-14  6:02   ` Dave Chinner
2012-11-14 20:42     ` Ben Myers
2012-11-14 21:27 ` Ben Myers
2012-11-15  4:40   ` Ben Myers
2012-11-15  6:03     ` Dave Chinner
2012-11-16  4:31       ` Ben Myers
2012-11-20  2:27 ` Ben Myers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121121080502.GP2591@dastard \
    --to=david@fromorbit.com \
    --cc=adahl@sgi.com \
    --cc=tinguely@sgi.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox