From: Dave Chinner <david@fromorbit.com>
To: xfs@oss.sgi.com
Subject: [PATCH 1/5] xfs: fix sub-page blocksize data integrity writes
Date: Fri, 17 May 2013 11:10:25 +1000 [thread overview]
Message-ID: <1368753029-9977-2-git-send-email-david@fromorbit.com> (raw)
In-Reply-To: <1368753029-9977-1-git-send-email-david@fromorbit.com>
From: Dave Chinner <dchinner@redhat.com>
FSX on 512 byte block size filesystems has been failing for some
time with corrupted data. The fault dates back to the change in
the writeback data integrity algorithm that uses a mark-and-sweep
approach to avoid data writeback livelocks.
Unfortunately, a side effect of this mark-and-sweep approach is that
each page will only be written once for a data integrity sync, and
there is a condition in writeback in XFS where a page may require
two writeback attempts to be fully written. As a result of the high
level change, we now only get a partial page writeback during the
integrity sync because the first pass through writeback clears the
mark left on the page index to tell writeback that the page needs
writeback....
The cause is writing a partial page in the clustering code. This can
happen when a mapping boundary falls in the middle of a page - we
end up writing back the first part of the page that the mapping
covers, but then never revisit the page to have the remainder mapped
and written.
The fix is simple - if the mapping boundary falls inside a page,
then simple abort clustering without touching the page. This means
that the next ->writepage entry that write_cache_pages() will make
is the page we aborted on, and xfs_vm_writepage() will map all
sections of the page correctly. This behaviour is also optimal for
non-data integrity writes, as it results in contiguous sequential
writeback of the file rather than missing small holes and having to
write them a "random" writes in a future pass.
With this fix, all the fsx tests in xfstests now pass on a 512 byte
block size filesystem on a 4k page machine.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_aops.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 2b2691b..f04eceb 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -725,6 +725,25 @@ xfs_convert_page(
(xfs_off_t)(page->index + 1) << PAGE_CACHE_SHIFT,
i_size_read(inode));
+ /*
+ * If the current map does not span the entire page we are about to try
+ * to write, then give up. The only way we can write a page that spans
+ * multiple mappings in a single writeback iteration is via the
+ * xfs_vm_writepage() function. Data integrity writeback requires the
+ * entire page to be written in a single attempt, otherwise the part of
+ * the page we don't write here doesn't get written as part of the data
+ * integrity sync.
+ *
+ * For normal writeback, we also don't attempt to write partial pages
+ * here as it simply means that write_cache_pages() will see it under
+ * writeback and ignore the page until some pointin the future, at which
+ * time this will be the only page inteh file that needs writeback.
+ * Hence for more optimal IO patterns, we should always avoid partial
+ * page writeback due to multiple mappings on a page here.
+ */
+ if (!xfs_imap_valid(inode, imap, end_offset))
+ goto fail_unlock_page;
+
len = 1 << inode->i_blkbits;
p_offset = min_t(unsigned long, end_offset & (PAGE_CACHE_SIZE - 1),
PAGE_CACHE_SIZE);
--
1.7.10.4
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2013-05-17 1:12 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-05-17 1:10 [PATCH 0/5] xfs: fixes for 3.10-rc2 Dave Chinner
2013-05-17 1:10 ` Dave Chinner [this message]
2013-05-17 1:10 ` [PATCH 2/5] xfs: fix rounding in xfs_free_file_space Dave Chinner
2013-05-17 1:10 ` [PATCH 3/5] xfs: Don't reference the EFI after it is freed Dave Chinner
2013-05-17 1:10 ` [PATCH 4/5] xfs: avoid nesting transactions in xfs_qm_scall_setqlim() Dave Chinner
2013-05-17 1:10 ` [PATCH 5/5] xfs: fix missing KM_NOFS tags to keep lockdep happy Dave Chinner
2013-05-17 10:15 ` [PATCH 6/5] xfs: xfs_da3_node_read_verify() doesn't handle XFS_ATTR3_LEAF_MAGIC Dave Chinner
2013-05-17 21:43 ` [PATCH 0/5] xfs: fixes for 3.10-rc2 Ben Myers
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1368753029-9977-2-git-send-email-david@fromorbit.com \
--to=david@fromorbit.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox