* [PATCH 0/3] simplify the writeback code
@ 2010-06-11 16:13 Christoph Hellwig
2010-06-11 16:13 ` [PATCH 1/3] xfs: fix corruption case for block size < page size Christoph Hellwig
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Christoph Hellwig @ 2010-06-11 16:13 UTC (permalink / raw)
To: xfs
The writeback path has a lot of code that's been dead for a while. We're
never asked to release dirty and thus delayed allocated or unwritten pages
from ->releasepage, and since the introduction of ->page_mkwrite we also
don't have to deal with dirty but unmapped buffers in ->writepage. But
there's a case masquerading as such, which needs to be fixed by a 1.5 years
old patch from Eric first. That patch also happens to fix xfstests 194.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 1/3] xfs: fix corruption case for block size < page size
2010-06-11 16:13 [PATCH 0/3] simplify the writeback code Christoph Hellwig
@ 2010-06-11 16:13 ` Christoph Hellwig
2010-06-15 1:21 ` Dave Chinner
2010-06-11 16:13 ` [PATCH 2/3] xfs: simplify xfs_vm_releasepage Christoph Hellwig
2010-06-11 16:13 ` [PATCH 3/3] xfs: simplify xfs_vm_writepage Christoph Hellwig
2 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2010-06-11 16:13 UTC (permalink / raw)
To: xfs; +Cc: Eric Sandeen
[-- Attachment #1: xfs-fix-194 --]
[-- Type: text/plain, Size: 1655 bytes --]
From: Eric Sandeen <sandeen@sandeen.net>
xfstests 194 first truncats a file back and then extends it again by
truncating it to a larger size. This causes discard_buffer to drop
the mapped, but not the uptodate bit and thus creates something that
xfs_page_state_convert takes for unmapped space created by mmap because
it doesn't check for the dirty bit, which also gets cleared by
discard_buffer and checked by other ->writepage implementations like
block_write_full_page. Handle this kind of buffers early, and unlike
Eric's first version of the patch simply ASSERT that the buffers is
dirty, given that the mmap write case can't happen anymore since the
introduction of ->page_mkwrite. The now dead code dealing with that
will be deleted in a follow on patch.
Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Index: xfs/fs/xfs/linux-2.6/xfs_aops.c
===================================================================
--- xfs.orig/fs/xfs/linux-2.6/xfs_aops.c 2010-06-10 08:59:49.598004335 +0200
+++ xfs/fs/xfs/linux-2.6/xfs_aops.c 2010-06-11 09:20:31.292254712 +0200
@@ -1125,6 +1125,16 @@ xfs_page_state_convert(
continue;
}
+ /*
+ * A hole may still be marked uptodate because discard_buffer
+ * leaves the flag set.
+ */
+ if (!buffer_mapped(bh) && buffer_uptodate(bh)) {
+ ASSERT(!buffer_dirty(bh));
+ imap_valid = 0;
+ continue;
+ }
+
if (imap_valid)
imap_valid = xfs_imap_valid(inode, &imap, offset);
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 2/3] xfs: simplify xfs_vm_releasepage
2010-06-11 16:13 [PATCH 0/3] simplify the writeback code Christoph Hellwig
2010-06-11 16:13 ` [PATCH 1/3] xfs: fix corruption case for block size < page size Christoph Hellwig
@ 2010-06-11 16:13 ` Christoph Hellwig
2010-06-15 1:37 ` Dave Chinner
2010-06-11 16:13 ` [PATCH 3/3] xfs: simplify xfs_vm_writepage Christoph Hellwig
2 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2010-06-11 16:13 UTC (permalink / raw)
To: xfs
[-- Attachment #1: xfs-simplify-releasepage --]
[-- Type: text/plain, Size: 17056 bytes --]
Currently the xfs releasepage implementation has code to deal with converting
delayed allocated and unwritten space. But we never get called for those as
we always convert delayed and unwritten space when cleaning a page, or drop
the state from the buffers in block_invalidatepage. We still keep a WARN_ON
on those cases for now, but remove all the case dealing with it, which allows
to fold xfs_page_state_convert into xfs_vm_writepage and remove the !startio
case from the whole writeback path.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Index: xfs/fs/xfs/linux-2.6/xfs_aops.c
===================================================================
--- xfs.orig/fs/xfs/linux-2.6/xfs_aops.c 2010-06-11 09:22:13.864253873 +0200
+++ xfs/fs/xfs/linux-2.6/xfs_aops.c 2010-06-11 09:27:27.602003840 +0200
@@ -754,7 +754,6 @@ xfs_convert_page(
struct xfs_bmbt_irec *imap,
xfs_ioend_t **ioendp,
struct writeback_control *wbc,
- int startio,
int all_bh)
{
struct buffer_head *bh, *head;
@@ -825,19 +824,14 @@ xfs_convert_page(
ASSERT(imap->br_startblock != DELAYSTARTBLOCK);
xfs_map_at_offset(inode, bh, imap, offset);
- if (startio) {
- xfs_add_to_ioend(inode, bh, offset,
- type, ioendp, done);
- } else {
- set_buffer_dirty(bh);
- unlock_buffer(bh);
- mark_buffer_dirty(bh);
- }
+ xfs_add_to_ioend(inode, bh, offset, type,
+ ioendp, done);
+
page_dirty--;
count++;
} else {
type = IO_NEW;
- if (buffer_mapped(bh) && all_bh && startio) {
+ if (buffer_mapped(bh) && all_bh) {
lock_buffer(bh);
xfs_add_to_ioend(inode, bh, offset,
type, ioendp, done);
@@ -852,14 +846,12 @@ xfs_convert_page(
if (uptodate && bh == head)
SetPageUptodate(page);
- if (startio) {
- if (count) {
- wbc->nr_to_write--;
- if (wbc->nr_to_write <= 0)
- done = 1;
- }
- xfs_start_page_writeback(page, !page_dirty, count);
+ if (count) {
+ wbc->nr_to_write--;
+ if (wbc->nr_to_write <= 0)
+ done = 1;
}
+ xfs_start_page_writeback(page, !page_dirty, count);
return done;
fail_unlock_page:
@@ -879,7 +871,6 @@ xfs_cluster_write(
struct xfs_bmbt_irec *imap,
xfs_ioend_t **ioendp,
struct writeback_control *wbc,
- int startio,
int all_bh,
pgoff_t tlast)
{
@@ -895,7 +886,7 @@ xfs_cluster_write(
for (i = 0; i < pagevec_count(&pvec); i++) {
done = xfs_convert_page(inode, pvec.pages[i], tindex++,
- imap, ioendp, wbc, startio, all_bh);
+ imap, ioendp, wbc, all_bh);
if (done)
break;
}
@@ -1025,51 +1016,101 @@ out_invalidate:
}
/*
- * Calling this without startio set means we are being asked to make a dirty
- * page ready for freeing it's buffers. When called with startio set then
- * we are coming from writepage.
+ * Write out a dirty page.
*
- * When called with startio set it is important that we write the WHOLE
- * page if possible.
- * The bh->b_state's cannot know if any of the blocks or which block for
- * that matter are dirty due to mmap writes, and therefore bh uptodate is
- * only valid if the page itself isn't completely uptodate. Some layers
- * may clear the page dirty flag prior to calling write page, under the
- * assumption the entire page will be written out; by not writing out the
- * whole page the page can be reused before all valid dirty data is
- * written out. Note: in the case of a page that has been dirty'd by
- * mapwrite and but partially setup by block_prepare_write the
- * bh->b_states's will not agree and only ones setup by BPW/BCW will have
- * valid state, thus the whole page must be written out thing.
+ * For delalloc space on the page we need to allocate space and flush it.
+ * For unwritten space on the page we need to start the conversion to
+ * regular allocated space.
+ * For unmapped buffer heads on the page we should allocate space if the
+ * page is uptodate.
+ * For any other dirty buffer heads on the page we should flush them.
+ *
+ * If we detect that a transaction would be required to flush the page, we
+ * have to check the process flags first, if we are already in a transaction
+ * or disk I/O during allocations is off, we need to fail the writepage and
+ * redirty the page.
+ *
+ * The bh->b_state's cannot know if any of the blocks or which block for that
+ * matter are dirty due to mmap writes, and therefore bh uptodate is only
+ * valid if the page itself isn't completely uptodate.
*/
-
STATIC int
-xfs_page_state_convert(
- struct inode *inode,
- struct page *page,
- struct writeback_control *wbc,
- int startio,
- int unmapped) /* also implies page uptodate */
+xfs_vm_writepage(
+ struct page *page,
+ struct writeback_control *wbc)
{
+ struct inode *inode = page->mapping->host;
+ int need_trans;
+ int delalloc, unmapped, unwritten;
struct buffer_head *bh, *head;
struct xfs_bmbt_irec imap;
xfs_ioend_t *ioend = NULL, *iohead = NULL;
loff_t offset;
- unsigned long p_offset = 0;
unsigned int type;
__uint64_t end_offset;
pgoff_t end_index, last_index;
ssize_t size, len;
int flags, err, imap_valid = 0, uptodate = 1;
- int page_dirty, count = 0;
- int trylock = 0;
- int all_bh = unmapped;
-
- if (startio) {
- if (wbc->sync_mode == WB_SYNC_NONE && wbc->nonblocking)
- trylock |= BMAPI_TRYLOCK;
+ int count = 0;
+ int all_bh;
+
+ trace_xfs_writepage(inode, page, 0);
+
+ /*
+ * Refuse to write the page out if we are called from reclaim context.
+ *
+ * This is primarily to avoid stack overflows when called from deep
+ * used stacks in random callers for direct reclaim, but disabling
+ * reclaim for kswap is a nice side-effect as kswapd causes rather
+ * suboptimal I/O patters, too.
+ *
+ * This should really be done by the core VM, but until that happens
+ * filesystems like XFS, btrfs and ext4 have to take care of this
+ * by themselves.
+ */
+ if (current->flags & PF_MEMALLOC)
+ goto out_fail;
+
+ /*
+ * We need a transaction if:
+ * 1. There are delalloc buffers on the page
+ * 2. The page is uptodate and we have unmapped buffers
+ * 3. The page is uptodate and we have no buffers
+ * 4. There are unwritten buffers on the page
+ */
+ if (!page_has_buffers(page)) {
+ unmapped = 1;
+ need_trans = 1;
+ } else {
+ xfs_count_page_state(page, &delalloc, &unmapped, &unwritten);
+ if (!PageUptodate(page))
+ unmapped = 0;
+ need_trans = delalloc + unmapped + unwritten;
}
+ /*
+ * If we need a transaction and the process flags say
+ * we are already in a transaction, or no IO is allowed
+ * then mark the page dirty again and leave the page
+ * as is.
+ */
+ if (current_test_flags(PF_FSTRANS) && need_trans)
+ goto out_fail;
+
+ /*
+ * Delay hooking up buffer heads until we have
+ * made our go/no-go decision.
+ */
+ if (!page_has_buffers(page))
+ create_empty_buffers(page, 1 << inode->i_blkbits, 0);
+
+ /*
+ * VM calculation for nr_to_write seems off. Bump it way
+ * up, this gets simple streaming writes zippy again.
+ * To be reviewed again after Jens' writeback changes.
+ */
+ wbc->nr_to_write *= 4;
+
/* Is this page beyond the end of the file? */
offset = i_size_read(inode);
end_index = offset >> PAGE_CACHE_SHIFT;
@@ -1077,53 +1118,27 @@ xfs_page_state_convert(
if (page->index >= end_index) {
if ((page->index >= end_index + 1) ||
!(i_size_read(inode) & (PAGE_CACHE_SIZE - 1))) {
- if (startio)
- unlock_page(page);
+ unlock_page(page);
return 0;
}
}
- /*
- * page_dirty is initially a count of buffers on the page before
- * EOF and is decremented as we move each into a cleanable state.
- *
- * Derivation:
- *
- * End offset is the highest offset that this page should represent.
- * If we are on the last page, (end_offset & (PAGE_CACHE_SIZE - 1))
- * will evaluate non-zero and be less than PAGE_CACHE_SIZE and
- * hence give us the correct page_dirty count. On any other page,
- * it will be zero and in that case we need page_dirty to be the
- * count of buffers on the page.
- */
end_offset = min_t(unsigned long long,
(xfs_off_t)(page->index + 1) << PAGE_CACHE_SHIFT, offset);
len = 1 << inode->i_blkbits;
- p_offset = min_t(unsigned long, end_offset & (PAGE_CACHE_SIZE - 1),
- PAGE_CACHE_SIZE);
- p_offset = p_offset ? roundup(p_offset, len) : PAGE_CACHE_SIZE;
- page_dirty = p_offset / len;
bh = head = page_buffers(page);
offset = page_offset(page);
flags = BMAPI_READ;
type = IO_NEW;
- /* TODO: cleanup count and page_dirty */
+ all_bh = unmapped;
do {
if (offset >= end_offset)
break;
if (!buffer_uptodate(bh))
uptodate = 0;
- if (!(PageUptodate(page) || buffer_uptodate(bh)) && !startio) {
- /*
- * the iomap is actually still valid, but the ioend
- * isn't. shouldn't happen too often.
- */
- imap_valid = 0;
- continue;
- }
/*
* A hole may still be marked uptodate because discard_buffer
@@ -1150,7 +1165,7 @@ xfs_page_state_convert(
*/
if (buffer_unwritten(bh) || buffer_delay(bh) ||
((buffer_uptodate(bh) || PageUptodate(page)) &&
- !buffer_mapped(bh) && (unmapped || startio))) {
+ !buffer_mapped(bh))) {
int new_ioend = 0;
/*
@@ -1164,7 +1179,11 @@ xfs_page_state_convert(
flags = BMAPI_WRITE | BMAPI_IGNSTATE;
} else if (buffer_delay(bh)) {
type = IO_DELAY;
- flags = BMAPI_ALLOCATE | trylock;
+ flags = BMAPI_ALLOCATE;
+
+ if (wbc->sync_mode == WB_SYNC_NONE &&
+ wbc->nonblocking)
+ flags |= BMAPI_TRYLOCK;
} else {
type = IO_NEW;
flags = BMAPI_WRITE | BMAPI_MMAP;
@@ -1196,19 +1215,11 @@ xfs_page_state_convert(
}
if (imap_valid) {
xfs_map_at_offset(inode, bh, &imap, offset);
- if (startio) {
- xfs_add_to_ioend(inode, bh, offset,
- type, &ioend,
- new_ioend);
- } else {
- set_buffer_dirty(bh);
- unlock_buffer(bh);
- mark_buffer_dirty(bh);
- }
- page_dirty--;
+ xfs_add_to_ioend(inode, bh, offset, type,
+ &ioend, new_ioend);
count++;
}
- } else if (buffer_uptodate(bh) && startio) {
+ } else if (buffer_uptodate(bh)) {
/*
* we got here because the buffer is already mapped.
* That means it must already have extents allocated
@@ -1241,13 +1252,11 @@ xfs_page_state_convert(
all_bh = 1;
xfs_add_to_ioend(inode, bh, offset, type,
&ioend, !imap_valid);
- page_dirty--;
count++;
} else {
imap_valid = 0;
}
- } else if ((buffer_uptodate(bh) || PageUptodate(page)) &&
- (unmapped || startio)) {
+ } else if (PageUptodate(page)) {
imap_valid = 0;
}
@@ -1259,8 +1268,7 @@ xfs_page_state_convert(
if (uptodate && bh == head)
SetPageUptodate(page);
- if (startio)
- xfs_start_page_writeback(page, 1, count);
+ xfs_start_page_writeback(page, 1, count);
if (ioend && imap_valid) {
xfs_off_t end_index;
@@ -1278,139 +1286,28 @@ xfs_page_state_convert(
end_index = last_index;
xfs_cluster_write(inode, page->index + 1, &imap, &ioend,
- wbc, startio, all_bh, end_index);
+ wbc, all_bh, end_index);
}
if (iohead)
xfs_submit_ioend(wbc, iohead);
- return page_dirty;
+ return 0;
error:
if (iohead)
xfs_cancel_ioend(iohead);
- /*
- * If it's delalloc and we have nowhere to put it,
- * throw it away, unless the lower layers told
- * us to try again.
- */
- if (err != -EAGAIN) {
- if (!unmapped)
- xfs_aops_discard_page(page);
- ClearPageUptodate(page);
- }
+ if (!unmapped)
+ xfs_aops_discard_page(page);
+ ClearPageUptodate(page);
+ unlock_page(page);
return err;
-}
-
-/*
- * writepage: Called from one of two places:
- *
- * 1. we are flushing a delalloc buffer head.
- *
- * 2. we are writing out a dirty page. Typically the page dirty
- * state is cleared before we get here. In this case is it
- * conceivable we have no buffer heads.
- *
- * For delalloc space on the page we need to allocate space and
- * flush it. For unmapped buffer heads on the page we should
- * allocate space if the page is uptodate. For any other dirty
- * buffer heads on the page we should flush them.
- *
- * If we detect that a transaction would be required to flush
- * the page, we have to check the process flags first, if we
- * are already in a transaction or disk I/O during allocations
- * is off, we need to fail the writepage and redirty the page.
- */
-
-STATIC int
-xfs_vm_writepage(
- struct page *page,
- struct writeback_control *wbc)
-{
- int error;
- int need_trans;
- int delalloc, unmapped, unwritten;
- struct inode *inode = page->mapping->host;
-
- trace_xfs_writepage(inode, page, 0);
-
- /*
- * Refuse to write the page out if we are called from reclaim context.
- *
- * This is primarily to avoid stack overflows when called from deep
- * used stacks in random callers for direct reclaim, but disabling
- * reclaim for kswap is a nice side-effect as kswapd causes rather
- * suboptimal I/O patters, too.
- *
- * This should really be done by the core VM, but until that happens
- * filesystems like XFS, btrfs and ext4 have to take care of this
- * by themselves.
- */
- if (current->flags & PF_MEMALLOC)
- goto out_fail;
-
- /*
- * We need a transaction if:
- * 1. There are delalloc buffers on the page
- * 2. The page is uptodate and we have unmapped buffers
- * 3. The page is uptodate and we have no buffers
- * 4. There are unwritten buffers on the page
- */
-
- if (!page_has_buffers(page)) {
- unmapped = 1;
- need_trans = 1;
- } else {
- xfs_count_page_state(page, &delalloc, &unmapped, &unwritten);
- if (!PageUptodate(page))
- unmapped = 0;
- need_trans = delalloc + unmapped + unwritten;
- }
-
- /*
- * If we need a transaction and the process flags say
- * we are already in a transaction, or no IO is allowed
- * then mark the page dirty again and leave the page
- * as is.
- */
- if (current_test_flags(PF_FSTRANS) && need_trans)
- goto out_fail;
-
- /*
- * Delay hooking up buffer heads until we have
- * made our go/no-go decision.
- */
- if (!page_has_buffers(page))
- create_empty_buffers(page, 1 << inode->i_blkbits, 0);
-
-
- /*
- * VM calculation for nr_to_write seems off. Bump it way
- * up, this gets simple streaming writes zippy again.
- * To be reviewed again after Jens' writeback changes.
- */
- wbc->nr_to_write *= 4;
-
- /*
- * Convert delayed allocate, unwritten or unmapped space
- * to real space and flush out to disk.
- */
- error = xfs_page_state_convert(inode, page, wbc, 1, unmapped);
- if (error == -EAGAIN)
- goto out_fail;
- if (unlikely(error < 0))
- goto out_unlock;
-
- return 0;
out_fail:
redirty_page_for_writepage(wbc, page);
unlock_page(page);
return 0;
-out_unlock:
- unlock_page(page);
- return error;
}
STATIC int
@@ -1424,65 +1321,27 @@ xfs_vm_writepages(
/*
* Called to move a page into cleanable state - and from there
- * to be released. Possibly the page is already clean. We always
+ * to be released. The page should already be clean. We always
* have buffer heads in this call.
*
- * Returns 0 if the page is ok to release, 1 otherwise.
- *
- * Possible scenarios are:
- *
- * 1. We are being called to release a page which has been written
- * to via regular I/O. buffer heads will be dirty and possibly
- * delalloc. If no delalloc buffer heads in this case then we
- * can just return zero.
- *
- * 2. We are called to release a page which has been written via
- * mmap, all we need to do is ensure there is no delalloc
- * state in the buffer heads, if not we can let the caller
- * free them and we should come back later via writepage.
+ * Returns 1 if the page is ok to release, 0 otherwise.
*/
STATIC int
xfs_vm_releasepage(
struct page *page,
gfp_t gfp_mask)
{
- struct inode *inode = page->mapping->host;
- int dirty, delalloc, unmapped, unwritten;
- struct writeback_control wbc = {
- .sync_mode = WB_SYNC_ALL,
- .nr_to_write = 1,
- };
-
- trace_xfs_releasepage(inode, page, 0);
+ int delalloc, unmapped, unwritten;
- if (!page_has_buffers(page))
- return 0;
+ trace_xfs_releasepage(page->mapping->host, page, 0);
xfs_count_page_state(page, &delalloc, &unmapped, &unwritten);
- if (!delalloc && !unwritten)
- goto free_buffers;
- if (!(gfp_mask & __GFP_FS))
+ if (WARN_ON(delalloc))
return 0;
-
- /* If we are already inside a transaction or the thread cannot
- * do I/O, we cannot release this page.
- */
- if (current_test_flags(PF_FSTRANS))
+ if (WARN_ON(unwritten))
return 0;
- /*
- * Convert delalloc space to real space, do not flush the
- * data out to disk, that will be done by the caller.
- * Never need to allocate space here - we will always
- * come back to writepage in that case.
- */
- dirty = xfs_page_state_convert(inode, page, &wbc, 0, 0);
- if (dirty == 0 && !unwritten)
- goto free_buffers;
- return 0;
-
-free_buffers:
return try_to_free_buffers(page);
}
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 3/3] xfs: simplify xfs_vm_writepage
2010-06-11 16:13 [PATCH 0/3] simplify the writeback code Christoph Hellwig
2010-06-11 16:13 ` [PATCH 1/3] xfs: fix corruption case for block size < page size Christoph Hellwig
2010-06-11 16:13 ` [PATCH 2/3] xfs: simplify xfs_vm_releasepage Christoph Hellwig
@ 2010-06-11 16:13 ` Christoph Hellwig
2010-06-15 1:50 ` Dave Chinner
2 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2010-06-11 16:13 UTC (permalink / raw)
To: xfs
[-- Attachment #1: xfs-simplify-writepage --]
[-- Type: text/plain, Size: 11740 bytes --]
The writepage implementation in XFS still tries to deal with dirty but
unmapped buffers which used to caused by writes through shared mmaps. Since
the introduction of ->page_mkwrite these can't happen anymore, so remove the
code dealing with them.
Note that the all_bh variable which causes us to start I/O on all buffers on
the pages was controlled by the count of unmapped buffers, which also
included those not actually dirty. It's now unconditionally initialized to
0 but set to 1 for the case of small file size extensions. It probably can
be removed entirely, but that's left for another patch.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Index: xfs/fs/xfs/linux-2.6/xfs_aops.c
===================================================================
--- xfs.orig/fs/xfs/linux-2.6/xfs_aops.c 2010-06-11 09:27:27.602003840 +0200
+++ xfs/fs/xfs/linux-2.6/xfs_aops.c 2010-06-11 09:28:45.408003978 +0200
@@ -85,18 +85,15 @@ void
xfs_count_page_state(
struct page *page,
int *delalloc,
- int *unmapped,
int *unwritten)
{
struct buffer_head *bh, *head;
- *delalloc = *unmapped = *unwritten = 0;
+ *delalloc = *unwritten = 0;
bh = head = page_buffers(page);
do {
- if (buffer_uptodate(bh) && !buffer_mapped(bh))
- (*unmapped) = 1;
- else if (buffer_unwritten(bh))
+ if (buffer_unwritten(bh))
(*unwritten) = 1;
else if (buffer_delay(bh))
(*delalloc) = 1;
@@ -607,31 +604,30 @@ xfs_map_at_offset(
STATIC unsigned int
xfs_probe_page(
struct page *page,
- unsigned int pg_offset,
- int mapped)
+ unsigned int pg_offset)
{
+ struct buffer_head *bh, *head;
int ret = 0;
if (PageWriteback(page))
return 0;
+ if (!PageDirty(page))
+ return 0;
+ if (!page->mapping)
+ return 0;
+ if (!page_has_buffers(page))
+ return 0;
- if (page->mapping && PageDirty(page)) {
- if (page_has_buffers(page)) {
- struct buffer_head *bh, *head;
-
- bh = head = page_buffers(page);
- do {
- if (!buffer_uptodate(bh))
- break;
- if (mapped != buffer_mapped(bh))
- break;
- ret += bh->b_size;
- if (ret >= pg_offset)
- break;
- } while ((bh = bh->b_this_page) != head);
- } else
- ret = mapped ? 0 : PAGE_CACHE_SIZE;
- }
+ bh = head = page_buffers(page);
+ do {
+ if (!buffer_uptodate(bh))
+ break;
+ if (!buffer_mapped(bh))
+ break;
+ ret += bh->b_size;
+ if (ret >= pg_offset)
+ break;
+ } while ((bh = bh->b_this_page) != head);
return ret;
}
@@ -641,8 +637,7 @@ xfs_probe_cluster(
struct inode *inode,
struct page *startpage,
struct buffer_head *bh,
- struct buffer_head *head,
- int mapped)
+ struct buffer_head *head)
{
struct pagevec pvec;
pgoff_t tindex, tlast, tloff;
@@ -651,7 +646,7 @@ xfs_probe_cluster(
/* First sum forwards in this page */
do {
- if (!buffer_uptodate(bh) || (mapped != buffer_mapped(bh)))
+ if (!buffer_uptodate(bh) || !buffer_mapped(bh))
return total;
total += bh->b_size;
} while ((bh = bh->b_this_page) != head);
@@ -685,7 +680,7 @@ xfs_probe_cluster(
pg_offset = PAGE_CACHE_SIZE;
if (page->index == tindex && trylock_page(page)) {
- pg_len = xfs_probe_page(page, pg_offset, mapped);
+ pg_len = xfs_probe_page(page, pg_offset);
unlock_page(page);
}
@@ -1021,18 +1016,12 @@ out_invalidate:
* For delalloc space on the page we need to allocate space and flush it.
* For unwritten space on the page we need to start the conversion to
* regular allocated space.
- * For unmapped buffer heads on the page we should allocate space if the
- * page is uptodate.
* For any other dirty buffer heads on the page we should flush them.
*
* If we detect that a transaction would be required to flush the page, we
* have to check the process flags first, if we are already in a transaction
* or disk I/O during allocations is off, we need to fail the writepage and
* redirty the page.
- *
- * The bh->b_state's cannot know if any of the blocks or which block for that
- * matter are dirty due to mmap writes, and therefore bh uptodate is only
- * valid if the page itself isn't completely uptodate.
*/
STATIC int
xfs_vm_writepage(
@@ -1040,8 +1029,7 @@ xfs_vm_writepage(
struct writeback_control *wbc)
{
struct inode *inode = page->mapping->host;
- int need_trans;
- int delalloc, unmapped, unwritten;
+ int delalloc, unwritten;
struct buffer_head *bh, *head;
struct xfs_bmbt_irec imap;
xfs_ioend_t *ioend = NULL, *iohead = NULL;
@@ -1052,10 +1040,12 @@ xfs_vm_writepage(
ssize_t size, len;
int flags, err, imap_valid = 0, uptodate = 1;
int count = 0;
- int all_bh;
+ int all_bh = 0;
trace_xfs_writepage(inode, page, 0);
+ ASSERT(page_has_buffers(page));
+
/*
* Refuse to write the page out if we are called from reclaim context.
*
@@ -1072,39 +1062,18 @@ xfs_vm_writepage(
goto out_fail;
/*
- * We need a transaction if:
- * 1. There are delalloc buffers on the page
- * 2. The page is uptodate and we have unmapped buffers
- * 3. The page is uptodate and we have no buffers
- * 4. There are unwritten buffers on the page
- */
- if (!page_has_buffers(page)) {
- unmapped = 1;
- need_trans = 1;
- } else {
- xfs_count_page_state(page, &delalloc, &unmapped, &unwritten);
- if (!PageUptodate(page))
- unmapped = 0;
- need_trans = delalloc + unmapped + unwritten;
- }
-
- /*
- * If we need a transaction and the process flags say
- * we are already in a transaction, or no IO is allowed
- * then mark the page dirty again and leave the page
- * as is.
+ * We need a transaction if there are delalloc or unwritten buffers
+ * on the page.
+ *
+ * If we need a transaction and the process flags say we are already
+ * in a transaction, or no IO is allowed then mark the page dirty
+ * again and leave the page as is.
*/
- if (current_test_flags(PF_FSTRANS) && need_trans)
+ xfs_count_page_state(page, &delalloc, &unwritten);
+ if ((current->flags & PF_FSTRANS) && (delalloc || unwritten))
goto out_fail;
/*
- * Delay hooking up buffer heads until we have
- * made our go/no-go decision.
- */
- if (!page_has_buffers(page))
- create_empty_buffers(page, 1 << inode->i_blkbits, 0);
-
- /*
* VM calculation for nr_to_write seems off. Bump it way
* up, this gets simple streaming writes zippy again.
* To be reviewed again after Jens' writeback changes.
@@ -1124,7 +1093,8 @@ xfs_vm_writepage(
}
end_offset = min_t(unsigned long long,
- (xfs_off_t)(page->index + 1) << PAGE_CACHE_SHIFT, offset);
+ (xfs_off_t)(page->index + 1) << PAGE_CACHE_SHIFT,
+ offset);
len = 1 << inode->i_blkbits;
bh = head = page_buffers(page);
@@ -1132,8 +1102,6 @@ xfs_vm_writepage(
flags = BMAPI_READ;
type = IO_NEW;
- all_bh = unmapped;
-
do {
if (offset >= end_offset)
break;
@@ -1153,19 +1121,7 @@ xfs_vm_writepage(
if (imap_valid)
imap_valid = xfs_imap_valid(inode, &imap, offset);
- /*
- * First case, map an unwritten extent and prepare for
- * extent state conversion transaction on completion.
- *
- * Second case, allocate space for a delalloc buffer.
- * We can return EAGAIN here in the release page case.
- *
- * Third case, an unmapped buffer was found, and we are
- * in a path where we need to write the whole page out.
- */
- if (buffer_unwritten(bh) || buffer_delay(bh) ||
- ((buffer_uptodate(bh) || PageUptodate(page)) &&
- !buffer_mapped(bh))) {
+ if (buffer_unwritten(bh) || buffer_delay(bh)) {
int new_ioend = 0;
/*
@@ -1184,14 +1140,11 @@ xfs_vm_writepage(
if (wbc->sync_mode == WB_SYNC_NONE &&
wbc->nonblocking)
flags |= BMAPI_TRYLOCK;
- } else {
- type = IO_NEW;
- flags = BMAPI_WRITE | BMAPI_MMAP;
}
if (!imap_valid) {
/*
- * if we didn't have a valid mapping then we
+ * If we didn't have a valid mapping then we
* need to ensure that we put the new mapping
* in a new ioend structure. This needs to be
* done to ensure that the ioends correctly
@@ -1199,14 +1152,7 @@ xfs_vm_writepage(
* for unwritten extent conversion.
*/
new_ioend = 1;
- if (type == IO_NEW) {
- size = xfs_probe_cluster(inode,
- page, bh, head, 0);
- } else {
- size = len;
- }
-
- err = xfs_map_blocks(inode, offset, size,
+ err = xfs_map_blocks(inode, offset, len,
&imap, flags);
if (err)
goto error;
@@ -1227,8 +1173,7 @@ xfs_vm_writepage(
*/
if (!imap_valid || flags != BMAPI_READ) {
flags = BMAPI_READ;
- size = xfs_probe_cluster(inode, page, bh,
- head, 1);
+ size = xfs_probe_cluster(inode, page, bh, head);
err = xfs_map_blocks(inode, offset, size,
&imap, flags);
if (err)
@@ -1247,7 +1192,6 @@ xfs_vm_writepage(
*/
type = IO_NEW;
if (trylock_buffer(bh)) {
- ASSERT(buffer_mapped(bh));
if (imap_valid)
all_bh = 1;
xfs_add_to_ioend(inode, bh, offset, type,
@@ -1257,6 +1201,7 @@ xfs_vm_writepage(
imap_valid = 0;
}
} else if (PageUptodate(page)) {
+ ASSERT(buffer_mapped(bh));
imap_valid = 0;
}
@@ -1298,8 +1243,7 @@ error:
if (iohead)
xfs_cancel_ioend(iohead);
- if (!unmapped)
- xfs_aops_discard_page(page);
+ xfs_aops_discard_page(page);
ClearPageUptodate(page);
unlock_page(page);
return err;
@@ -1331,11 +1275,11 @@ xfs_vm_releasepage(
struct page *page,
gfp_t gfp_mask)
{
- int delalloc, unmapped, unwritten;
+ int delalloc, unwritten;
trace_xfs_releasepage(page->mapping->host, page, 0);
- xfs_count_page_state(page, &delalloc, &unmapped, &unwritten);
+ xfs_count_page_state(page, &delalloc, &unwritten);
if (WARN_ON(delalloc))
return 0;
Index: xfs/fs/xfs/linux-2.6/xfs_aops.h
===================================================================
--- xfs.orig/fs/xfs/linux-2.6/xfs_aops.h 2010-02-13 22:45:45.000000000 +0100
+++ xfs/fs/xfs/linux-2.6/xfs_aops.h 2010-06-11 09:27:44.075004398 +0200
@@ -45,6 +45,6 @@ extern int xfs_get_blocks(struct inode *
extern void xfs_ioend_init(void);
extern void xfs_ioend_wait(struct xfs_inode *);
-extern void xfs_count_page_state(struct page *, int *, int *, int *);
+extern void xfs_count_page_state(struct page *, int *, int *);
#endif /* __XFS_AOPS_H__ */
Index: xfs/fs/xfs/linux-2.6/xfs_trace.h
===================================================================
--- xfs.orig/fs/xfs/linux-2.6/xfs_trace.h 2010-06-04 15:12:52.000000000 +0200
+++ xfs/fs/xfs/linux-2.6/xfs_trace.h 2010-06-11 09:27:44.083003979 +0200
@@ -829,33 +829,29 @@ DECLARE_EVENT_CLASS(xfs_page_class,
__field(loff_t, size)
__field(unsigned long, offset)
__field(int, delalloc)
- __field(int, unmapped)
__field(int, unwritten)
),
TP_fast_assign(
- int delalloc = -1, unmapped = -1, unwritten = -1;
+ int delalloc = -1, unwritten = -1;
if (page_has_buffers(page))
- xfs_count_page_state(page, &delalloc,
- &unmapped, &unwritten);
+ xfs_count_page_state(page, &delalloc, &unwritten);
__entry->dev = inode->i_sb->s_dev;
__entry->ino = XFS_I(inode)->i_ino;
__entry->pgoff = page_offset(page);
__entry->size = i_size_read(inode);
__entry->offset = off;
__entry->delalloc = delalloc;
- __entry->unmapped = unmapped;
__entry->unwritten = unwritten;
),
TP_printk("dev %d:%d ino 0x%llx pgoff 0x%lx size 0x%llx offset %lx "
- "delalloc %d unmapped %d unwritten %d",
+ "delalloc %d unwritten %d",
MAJOR(__entry->dev), MINOR(__entry->dev),
__entry->ino,
__entry->pgoff,
__entry->size,
__entry->offset,
__entry->delalloc,
- __entry->unmapped,
__entry->unwritten)
)
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 1/3] xfs: fix corruption case for block size < page size
2010-06-11 16:13 ` [PATCH 1/3] xfs: fix corruption case for block size < page size Christoph Hellwig
@ 2010-06-15 1:21 ` Dave Chinner
0 siblings, 0 replies; 8+ messages in thread
From: Dave Chinner @ 2010-06-15 1:21 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Eric Sandeen, xfs
On Fri, Jun 11, 2010 at 12:13:13PM -0400, Christoph Hellwig wrote:
> From: Eric Sandeen <sandeen@sandeen.net>
>
> xfstests 194 first truncats a file back and then extends it again by
> truncating it to a larger size. This causes discard_buffer to drop
> the mapped, but not the uptodate bit and thus creates something that
> xfs_page_state_convert takes for unmapped space created by mmap because
> it doesn't check for the dirty bit, which also gets cleared by
> discard_buffer and checked by other ->writepage implementations like
> block_write_full_page. Handle this kind of buffers early, and unlike
> Eric's first version of the patch simply ASSERT that the buffers is
> dirty, given that the mmap write case can't happen anymore since the
> introduction of ->page_mkwrite. The now dead code dealing with that
> will be deleted in a follow on patch.
>
> Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Looks good.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/3] xfs: simplify xfs_vm_releasepage
2010-06-11 16:13 ` [PATCH 2/3] xfs: simplify xfs_vm_releasepage Christoph Hellwig
@ 2010-06-15 1:37 ` Dave Chinner
2010-06-15 6:22 ` Christoph Hellwig
0 siblings, 1 reply; 8+ messages in thread
From: Dave Chinner @ 2010-06-15 1:37 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs
On Fri, Jun 11, 2010 at 12:13:14PM -0400, Christoph Hellwig wrote:
> Currently the xfs releasepage implementation has code to deal with converting
> delayed allocated and unwritten space. But we never get called for those as
> we always convert delayed and unwritten space when cleaning a page, or drop
> the state from the buffers in block_invalidatepage. We still keep a WARN_ON
> on those cases for now, but remove all the case dealing with it, which allows
> to fold xfs_page_state_convert into xfs_vm_writepage and remove the !startio
> case from the whole writeback path.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Looks good. Couple of things:
> +
> + /*
> + * VM calculation for nr_to_write seems off. Bump it way
> + * up, this gets simple streaming writes zippy again.
> + * To be reviewed again after Jens' writeback changes.
> + */
> + wbc->nr_to_write *= 4;
> +
That's gone in mainline as of as of .35-rc3. We need to get the
xfs-dev tree updated.
Otherwise, conѕider it:
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 3/3] xfs: simplify xfs_vm_writepage
2010-06-11 16:13 ` [PATCH 3/3] xfs: simplify xfs_vm_writepage Christoph Hellwig
@ 2010-06-15 1:50 ` Dave Chinner
0 siblings, 0 replies; 8+ messages in thread
From: Dave Chinner @ 2010-06-15 1:50 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: xfs
On Fri, Jun 11, 2010 at 12:13:15PM -0400, Christoph Hellwig wrote:
> The writepage implementation in XFS still tries to deal with dirty but
> unmapped buffers which used to caused by writes through shared mmaps. Since
> the introduction of ->page_mkwrite these can't happen anymore, so remove the
> code dealing with them.
>
> Note that the all_bh variable which causes us to start I/O on all buffers on
> the pages was controlled by the count of unmapped buffers, which also
> included those not actually dirty. It's now unconditionally initialized to
> 0 but set to 1 for the case of small file size extensions. It probably can
> be removed entirely, but that's left for another patch.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
Looks good.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 2/3] xfs: simplify xfs_vm_releasepage
2010-06-15 1:37 ` Dave Chinner
@ 2010-06-15 6:22 ` Christoph Hellwig
0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2010-06-15 6:22 UTC (permalink / raw)
To: Dave Chinner; +Cc: Christoph Hellwig, xfs
On Tue, Jun 15, 2010 at 11:37:37AM +1000, Dave Chinner wrote:
> > + /*
> > + * VM calculation for nr_to_write seems off. Bump it way
> > + * up, this gets simple streaming writes zippy again.
> > + * To be reviewed again after Jens' writeback changes.
> > + */
> > + wbc->nr_to_write *= 4;
> > +
>
> That's gone in mainline as of as of .35-rc3. We need to get the
> xfs-dev tree updated.
It's still in the XFS tree. Either we need to rebase that one, or sort
it out during a merge.
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2010-06-15 6:19 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-06-11 16:13 [PATCH 0/3] simplify the writeback code Christoph Hellwig
2010-06-11 16:13 ` [PATCH 1/3] xfs: fix corruption case for block size < page size Christoph Hellwig
2010-06-15 1:21 ` Dave Chinner
2010-06-11 16:13 ` [PATCH 2/3] xfs: simplify xfs_vm_releasepage Christoph Hellwig
2010-06-15 1:37 ` Dave Chinner
2010-06-15 6:22 ` Christoph Hellwig
2010-06-11 16:13 ` [PATCH 3/3] xfs: simplify xfs_vm_writepage Christoph Hellwig
2010-06-15 1:50 ` Dave Chinner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox