* [PATCH 1/4] mm: vmscan: Block kswapd if it is encountering pages under writeback -fix
From: Mel Gorman @ 2013-05-27 13:02 UTC
To: Andrew Morton
Cc: Jiri Slaby, Valdis Kletnieks, Rik van Riel, Zlatko Calusic,
Johannes Weiner, dormando, Michal Hocko, Jan Kara, Dave Chinner,
Kamezawa Hiroyuki, Linux-FSDevel, Linux-MM, LKML, Mel Gorman
The patch "mm: vmscan: Block kswapd if it is encountering pages
under writeback" stalls in congestion_wait it encounters a page under
writeback that is marked for immediate reclaim. Initially this was a
wait_on_page_writeback() but after the switch to congestion_wait(),
there is no guarantee the page has completed writeback and it can
be placed on a list for freeing.
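The resulting flow, condensed from the hunk below (a sketch only; the
surrounding case analysis in shrink_page_list() is elided):

	/* Case 1: kswapd met a PageReclaim page still under writeback */
	if (current_is_kswapd() &&
	    PageReclaim(page) &&
	    zone_is_reclaim_writeback(zone)) {
		unlock_page(page);			/* do not sleep holding the page lock */
		congestion_wait(BLK_RW_ASYNC, HZ/10);	/* wait for some IO to complete */
		zone_clear_flag(zone, ZONE_WRITEBACK);	/* recheck the condition next pass */
		goto keep;				/* page stays on the LRU; it may still be under writeback */
	}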
This is a fix for
mm-vmscan-block-kswapd-if-it-is-encountering-pages-under-writeback.patch
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
mm/vmscan.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index b1b38ad..4a43c28 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -766,8 +766,10 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 			if (current_is_kswapd() &&
 			    PageReclaim(page) &&
 			    zone_is_reclaim_writeback(zone)) {
+				unlock_page(page);
 				congestion_wait(BLK_RW_ASYNC, HZ/10);
 				zone_clear_flag(zone, ZONE_WRITEBACK);
+				goto keep;
 
 			/* Case 2 above */
 			} else if (global_reclaim(sc) ||
--
1.8.1.4
* [PATCH 2/4] mm: vmscan: Stall page reclaim and writeback pages based on dirty/writepage pages encountered
From: Mel Gorman @ 2013-05-27 13:02 UTC
To: Andrew Morton
Cc: Jiri Slaby, Valdis Kletnieks, Rik van Riel, Zlatko Calusic,
Johannes Weiner, dormando, Michal Hocko, Jan Kara, Dave Chinner,
Kamezawa Hiroyuki, Linux-FSDevel, Linux-MM, LKML, Mel Gorman
The patch "mm: vmscan: Have kswapd writeback pages based on dirty pages
encountered, not priority" decides whether to writeback pages from reclaim
context based on the number of dirty pages encountered. This situation
is flagged too easily and flushers are not given the chance to catch up
resulting in more pages being written from reclaim context and potentially
impacting IO performance. The check for PageWriteback is also misplaced
as it happens within a PageDirty check which is nonsense as the dirty may
have been cleared for IO. The accounting is updated very late and pages
that are already under writeback, were reactivated, could not unmapped or
could not be released are all missed. Finally, it considers stalling and
writing back filesystem pages due to encountering dirty anonymous pages
at the tail of the LRU which is dumb.
This patch causes kswapd to begin writing filesystem pages from reclaim
context only if page reclaim found that all filesystem pages at the tail of
the LRU were unqueued dirty pages. Before it starts writing filesystem pages,
it will stall to give flushers a chance to catch up. The decision on whether
wait_iff_congested is also now determined by dirty filesystem pages only.
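As a worked example of how much stricter the new trigger is: with
nr_taken == 32 isolated pages at sc->priority == 10 (DEF_PRIORITY is
12), the old test fired once nr_dirty >= 32 >> (12 - 10) == 8, i.e. a
quarter of the list being dirty; the new test fires only when all 32
isolated pages are unqueued dirty file pages.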
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
mm/vmscan.c | 52 ++++++++++++++++++++++++++++++++++++++++++----------
1 file changed, 42 insertions(+), 10 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 4a43c28..be8e445 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -669,6 +669,27 @@ static enum page_references page_check_references(struct page *page,
 	return PAGEREF_RECLAIM;
 }
 
+/* Check if a page is dirty or under writeback */
+static void page_check_dirty_writeback(struct page *page,
+				       bool *dirty, bool *writeback)
+{
+	struct address_space *mapping;
+
+	/*
+	 * Anonymous pages are not handled by flushers and must be written
+	 * from reclaim context. Do not stall reclaim based on them
+	 */
+	if (!page_is_file_cache(page)) {
+		*dirty = false;
+		*writeback = false;
+		return;
+	}
+
+	/* By default assume that the page flags are accurate */
+	*dirty = PageDirty(page);
+	*writeback = PageWriteback(page);
+}
+
 /*
  * shrink_page_list() returns the number of reclaimed pages
  */
@@ -697,6 +718,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		struct page *page;
 		int may_enter_fs;
 		enum page_references references = PAGEREF_RECLAIM_CLEAN;
+		bool dirty, writeback;
 
 		cond_resched();
@@ -725,6 +747,19 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 					(PageSwapCache(page) && (sc->gfp_mask & __GFP_IO));
 
 		/*
+		 * The number of dirty pages determines if a zone is marked
+		 * reclaim_congested which affects wait_iff_congested. kswapd
+		 * will stall and start writing pages if the tail of the LRU
+		 * is all dirty unqueued pages.
+		 */
+		page_check_dirty_writeback(page, &dirty, &writeback);
+		if (dirty || writeback)
+			nr_dirty++;
+
+		if (dirty && !writeback)
+			nr_unqueued_dirty++;
+
+		/*
 		 * If a page at the tail of the LRU is under writeback, there
 		 * are three cases to consider.
 		 *
@@ -841,11 +876,6 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		}
 
 		if (PageDirty(page)) {
-			nr_dirty++;
-
-			if (!PageWriteback(page))
-				nr_unqueued_dirty++;
-
 			/*
 			 * Only kswapd can writeback filesystem pages to
 			 * avoid risk of stack overflow but only writeback
@@ -1318,7 +1348,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 	unsigned long nr_scanned;
 	unsigned long nr_reclaimed = 0;
 	unsigned long nr_taken;
-	unsigned long nr_dirty = 0;
+	unsigned long nr_unqueued_dirty = 0;
 	unsigned long nr_writeback = 0;
 	isolate_mode_t isolate_mode = 0;
 	int file = is_file_lru(lru);
@@ -1361,7 +1391,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 		return 0;
 
 	nr_reclaimed = shrink_page_list(&page_list, zone, sc, TTU_UNMAP,
-				&nr_dirty, &nr_writeback, false);
+				&nr_unqueued_dirty, &nr_writeback, false);
 
 	spin_lock_irq(&zone->lru_lock);
@@ -1416,11 +1446,13 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 	/*
 	 * Similarly, if many dirty pages are encountered that are not
 	 * currently being written then flag that kswapd should start
-	 * writing back pages.
+	 * writing back pages and stall to give a chance for flushers
+	 * to catch up.
 	 */
-	if (global_reclaim(sc) && nr_dirty &&
-	    nr_dirty >= (nr_taken >> (DEF_PRIORITY - sc->priority)))
+	if (global_reclaim(sc) && nr_unqueued_dirty == nr_taken) {
+		congestion_wait(BLK_RW_ASYNC, HZ/10);
 		zone_set_flag(zone, ZONE_TAIL_LRU_DIRTY);
+	}
 
 	trace_mm_vmscan_lru_shrink_inactive(zone->zone_pgdat->node_id,
 		zone_idx(zone),
--
1.8.1.4
* [PATCH 3/4] mm: vmscan: Stall page reclaim after a list of pages have been processed
From: Mel Gorman @ 2013-05-27 13:02 UTC
To: Andrew Morton
Cc: Jiri Slaby, Valdis Kletnieks, Rik van Riel, Zlatko Calusic,
Johannes Weiner, dormando, Michal Hocko, Jan Kara, Dave Chinner,
Kamezawa Hiroyuki, Linux-FSDevel, Linux-MM, LKML, Mel Gorman
Commit "mm: vmscan: Block kswapd if it is encountering pages under
writeback" blocks page reclaim if it encounters pages under writeback
marked for immediate reclaim. It blocks while pages are still isolated
from the LRU which is necessary. This patch defers the blocking until
after the isolated pages have been processed.
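Condensed from the hunks below, the stall moves from per-page to
per-list (a sketch; the ZONE_TAIL_LRU_DIRTY handling and surrounding
context are elided):

	/* shrink_page_list(): only count, never sleep with pages isolated */
	if (current_is_kswapd() &&
	    PageReclaim(page) &&
	    zone_is_reclaim_writeback(zone)) {
		nr_immediate++;
		goto keep_locked;
	}

	/* shrink_inactive_list(): stall once the whole list is processed */
	if (global_reclaim(sc) &&
	    (nr_unqueued_dirty == nr_taken || nr_immediate)) {
		congestion_wait(BLK_RW_ASYNC, HZ/10);
		zone_clear_flag(zone, ZONE_WRITEBACK);
	}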
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
mm/vmscan.c | 41 +++++++++++++++++++++++++----------------
1 file changed, 25 insertions(+), 16 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index be8e445..f576bcc 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -699,6 +699,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 				      enum ttu_flags ttu_flags,
 				      unsigned long *ret_nr_unqueued_dirty,
 				      unsigned long *ret_nr_writeback,
+				      unsigned long *ret_nr_immediate,
 				      bool force_reclaim)
 {
 	LIST_HEAD(ret_pages);
@@ -709,6 +710,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 	unsigned long nr_congested = 0;
 	unsigned long nr_reclaimed = 0;
 	unsigned long nr_writeback = 0;
+	unsigned long nr_immediate = 0;
 
 	cond_resched();
@@ -770,8 +772,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		 * IO can complete. Waiting on the page itself risks an
 		 * indefinite stall if it is impossible to writeback the
 		 * page due to IO error or disconnected storage so instead
-		 * block for HZ/10 or until some IO completes then clear the
-		 * ZONE_WRITEBACK flag to recheck if the condition exists.
+		 * note that the LRU is being scanned too quickly and the
+		 * caller can stall after the page list has been processed.
 		 *
 		 * 2) Global reclaim encounters a page, memcg encounters a
 		 *    page that is not marked for immediate reclaim or
@@ -801,10 +803,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 			if (current_is_kswapd() &&
 			    PageReclaim(page) &&
 			    zone_is_reclaim_writeback(zone)) {
-				unlock_page(page);
-				congestion_wait(BLK_RW_ASYNC, HZ/10);
-				zone_clear_flag(zone, ZONE_WRITEBACK);
-				goto keep;
+				nr_immediate++;
+				goto keep_locked;
 
 			/* Case 2 above */
 			} else if (global_reclaim(sc) ||
@@ -1030,6 +1030,7 @@ keep:
 	mem_cgroup_uncharge_end();
 
 	*ret_nr_unqueued_dirty += nr_unqueued_dirty;
 	*ret_nr_writeback += nr_writeback;
+	*ret_nr_immediate += nr_immediate;
 	return nr_reclaimed;
 }
@@ -1041,7 +1042,7 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
 		.priority = DEF_PRIORITY,
 		.may_unmap = 1,
 	};
-	unsigned long ret, dummy1, dummy2;
+	unsigned long ret, dummy1, dummy2, dummy3;
 	struct page *page, *next;
 	LIST_HEAD(clean_pages);
@@ -1054,7 +1055,7 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
 	ret = shrink_page_list(&clean_pages, zone, &sc,
 				TTU_UNMAP|TTU_IGNORE_ACCESS,
-				&dummy1, &dummy2, true);
+				&dummy1, &dummy2, &dummy3, true);
 	list_splice(&clean_pages, page_list);
 	__mod_zone_page_state(zone, NR_ISOLATED_FILE, -ret);
 	return ret;
@@ -1350,6 +1351,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 	unsigned long nr_taken;
 	unsigned long nr_unqueued_dirty = 0;
 	unsigned long nr_writeback = 0;
+	unsigned long nr_immediate = 0;
 	isolate_mode_t isolate_mode = 0;
 	int file = is_file_lru(lru);
 	struct zone *zone = lruvec_zone(lruvec);
@@ -1391,7 +1393,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 		return 0;
 
 	nr_reclaimed = shrink_page_list(&page_list, zone, sc, TTU_UNMAP,
-				&nr_unqueued_dirty, &nr_writeback, false);
+				&nr_unqueued_dirty, &nr_writeback, &nr_immediate, false);
 
 	spin_lock_irq(&zone->lru_lock);
@@ -1444,14 +1446,21 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 	}
 
 	/*
-	 * Similarly, if many dirty pages are encountered that are not
-	 * currently being written then flag that kswapd should start
-	 * writing back pages and stall to give a chance for flushers
-	 * to catch up.
+	 * Similarly, if pages marked for immediate reclaim and under writeback
+	 * are encountered it implies that pages are cycling through the LRU
+	 * faster than they can be written. If dirty pages are encountered that
+	 * are not queued for IO, it implies that flushers are not keeping up.
+	 * In this case, be more aggressive about stalling and start writing
+	 * pages from reclaim context if necessary.
 	 */
-	if (global_reclaim(sc) && nr_unqueued_dirty == nr_taken) {
-		congestion_wait(BLK_RW_ASYNC, HZ/10);
-		zone_set_flag(zone, ZONE_TAIL_LRU_DIRTY);
+	if (global_reclaim(sc)) {
+		if (nr_unqueued_dirty == nr_taken || nr_immediate) {
+			congestion_wait(BLK_RW_ASYNC, HZ/10);
+			zone_clear_flag(zone, ZONE_WRITEBACK);
+		}
+
+		if (nr_unqueued_dirty == nr_taken)
+			zone_set_flag(zone, ZONE_TAIL_LRU_DIRTY);
 	}
 
 	trace_mm_vmscan_lru_shrink_inactive(zone->zone_pgdat->node_id,
--
1.8.1.4
* [PATCH 4/4] mm: vmscan: Take page buffers dirty and locked state into account
From: Mel Gorman @ 2013-05-27 13:02 UTC
To: Andrew Morton
Cc: Jiri Slaby, Valdis Kletnieks, Rik van Riel, Zlatko Calusic,
Johannes Weiner, dormando, Michal Hocko, Jan Kara, Dave Chinner,
Kamezawa Hiroyuki, Linux-FSDevel, Linux-MM, LKML, Mel Gorman
Page reclaim keeps track of dirty and under writeback pages and uses
this information to determine if wait_iff_congested() should stall or
if kswapd should begin writing back pages. This fails to account for
buffer pages that can be under writeback but not PageWriteback, which
is the case for filesystems like ext3 in ordered mode. Furthermore,
PageDirty pages can have all their buffers clean, in which case
writepage does no IO and the page should not be counted as congested.
This patch adds an address_space operation that filesystems may
optionally use to check if a page is really dirty or really under
writeback. An implementation based on buffer_heads is added and used
for block device operations and ext3 in ordered mode. By default the
page flags are obeyed.
Credit goes to Jan Kara for identifying that the page flags alone are
not sufficient for ext3 and sanity checking a number of ideas on how
the problem could be addressed.
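Any other filesystem that keeps its dirty state in buffer_heads could
opt in the same way as ext3 does below; a hypothetical example
(myfs_aops and its methods are made up for illustration):

	static const struct address_space_operations myfs_aops = {
		.writepage		= myfs_writepage,
		.readpage		= myfs_readpage,
		/* let reclaim query buffer state rather than page flags */
		.is_dirty_writeback	= buffer_check_dirty_writeback,
	};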
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
fs/block_dev.c | 1 +
fs/buffer.c | 34 ++++++++++++++++++++++++++++++++++
fs/ext3/inode.c | 1 +
include/linux/buffer_head.h | 3 +++
include/linux/fs.h | 1 +
mm/vmscan.c | 8 ++++++++
6 files changed, 48 insertions(+)
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 2091db8..9c8ebe4 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1583,6 +1583,7 @@ static const struct address_space_operations def_blk_aops = {
 	.writepages	= generic_writepages,
 	.releasepage	= blkdev_releasepage,
 	.direct_IO	= blkdev_direct_IO,
+	.is_dirty_writeback = buffer_check_dirty_writeback,
 };
 
 const struct file_operations def_blk_fops = {
diff --git a/fs/buffer.c b/fs/buffer.c
index 1aa0836..4247aa9 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -91,6 +91,40 @@ void unlock_buffer(struct buffer_head *bh)
 EXPORT_SYMBOL(unlock_buffer);
 
 /*
+ * Returns whether the page has dirty or writeback buffers. If all the
+ * buffers are unlocked and clean then the PageDirty information is stale.
+ * If any of the buffers are locked, it is assumed they are locked for IO.
+ */
+void buffer_check_dirty_writeback(struct page *page,
+				     bool *dirty, bool *writeback)
+{
+	struct buffer_head *head, *bh;
+	*dirty = false;
+	*writeback = false;
+
+	BUG_ON(!PageLocked(page));
+
+	if (!page_has_buffers(page))
+		return;
+
+	if (PageWriteback(page))
+		*writeback = true;
+
+	head = page_buffers(page);
+	bh = head;
+	do {
+		if (buffer_locked(bh))
+			*writeback = true;
+
+		if (buffer_dirty(bh))
+			*dirty = true;
+
+		bh = bh->b_this_page;
+	} while (bh != head);
+}
+EXPORT_SYMBOL(buffer_check_dirty_writeback);
+
+/*
  * Block until a buffer comes unlocked. This doesn't stop it
  * from becoming locked again - you have to lock it yourself
  * if you want to preserve its state.
diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index 23c7128..8e590bd 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -1984,6 +1984,7 @@ static const struct address_space_operations ext3_ordered_aops = {
 	.direct_IO		= ext3_direct_IO,
 	.migratepage		= buffer_migrate_page,
 	.is_partially_uptodate  = block_is_partially_uptodate,
+	.is_dirty_writeback	= buffer_check_dirty_writeback,
 	.error_remove_page	= generic_error_remove_page,
 };
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index 6d9f5a2..d458880 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -139,6 +139,9 @@ BUFFER_FNS(Prio, prio)
 	})
 #define page_has_buffers(page)	PagePrivate(page)
 
+void buffer_check_dirty_writeback(struct page *page,
+				     bool *dirty, bool *writeback);
+
 /*
  * Declarations
  */
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 0a9a6766..96f857f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -380,6 +380,7 @@ struct address_space_operations {
 	int (*launder_page) (struct page *);
 	int (*is_partially_uptodate) (struct page *, read_descriptor_t *,
 					unsigned long);
+	void (*is_dirty_writeback) (struct page *, bool *, bool *);
 	int (*error_remove_page)(struct address_space *, struct page *);
 
 	/* swapfile support */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index f576bcc..6237725 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -688,6 +688,14 @@ static void page_check_dirty_writeback(struct page *page,
 	/* By default assume that the page flags are accurate */
 	*dirty = PageDirty(page);
 	*writeback = PageWriteback(page);
+
+	/* Verify dirty/writeback state if the filesystem supports it */
+	if (!page_has_private(page))
+		return;
+
+	mapping = page_mapping(page);
+	if (mapping && mapping->a_ops->is_dirty_writeback)
+		mapping->a_ops->is_dirty_writeback(page, dirty, writeback);
 }
 
 /*
--
1.8.1.4
* Re: [PATCH 4/4] mm: vmscan: Take page buffers dirty and locked state into account
From: Andrew Morton @ 2013-05-29 19:53 UTC
To: Mel Gorman
Cc: Jiri Slaby, Valdis Kletnieks, Rik van Riel, Zlatko Calusic,
Johannes Weiner, dormando, Michal Hocko, Jan Kara, Dave Chinner,
Kamezawa Hiroyuki, Linux-FSDevel, Linux-MM, LKML
On Mon, 27 May 2013 14:02:58 +0100 Mel Gorman <mgorman@suse.de> wrote:
> Page reclaim keeps track of dirty and under writeback pages and uses
> this information to determine if wait_iff_congested() should stall or
> if kswapd should begin writing back pages. This fails to account for
> buffer pages that can be under writeback but not PageWriteback, which
> is the case for filesystems like ext3 in ordered mode. Furthermore,
> PageDirty pages can have all their buffers clean, in which case
> writepage does no IO and the page should not be counted as congested.
iirc, the PageDirty-all-buffers-clean state is pretty rare. It might
not be worth bothering about?
> This patch adds an address_space operation that filesystems may
> optionally use to check if a page is really dirty or really under
> writeback.
address_space_operations methods are Documented in
Documentation/filesystems/vfs.txt ;)
> An implementation based on buffer_heads is added and used for block
> device operations and ext3 in ordered mode. By default the page flags
> are obeyed.
>
> Credit goes to Jan Kara for identifying that the page flags alone are
> not sufficient for ext3 and sanity checking a number of ideas on how
> the problem could be addressed.
* Re: [PATCH 4/4] mm: vmscan: Take page buffers dirty and locked state into account
From: Jan Kara @ 2013-05-29 22:28 UTC
To: Andrew Morton
Cc: Mel Gorman, Jiri Slaby, Valdis Kletnieks, Rik van Riel,
Zlatko Calusic, Johannes Weiner, dormando, Michal Hocko, Jan Kara,
Dave Chinner, Kamezawa Hiroyuki, Linux-FSDevel, Linux-MM, LKML
On Wed 29-05-13 12:53:56, Andrew Morton wrote:
> On Mon, 27 May 2013 14:02:58 +0100 Mel Gorman <mgorman@suse.de> wrote:
>
> > Page reclaim keeps track of dirty and under writeback pages and
> > uses this information to determine if wait_iff_congested() should
> > stall or if kswapd should begin writing back pages. This fails to
> > account for buffer pages that can be under writeback but not
> > PageWriteback, which is the case for filesystems like ext3 in
> > ordered mode. Furthermore, PageDirty pages can have all their
> > buffers clean, in which case writepage does no IO and the page
> > should not be counted as congested.
>
> iirc, the PageDirty-all-buffers-clean state is pretty rare. It might
> not be worth bothering about?
Not true for ext3 in data=ordered mode. In some workloads, kjournald
ends up writing most of the data during journal commit and that leaves
exactly this situation: dirty pages with clean buffers. So in such a
setup, lots of dirty pages can be of that strange kind...
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR