* [PATCH] [RFC] Dirty page accounting on lru basis.
@ 2010-09-03 4:43 Ying Han
From: Ying Han @ 2010-09-03 4:43 UTC
To: riel, minchan.kim, hugh.dickins, kamezawa.hiroyu, fengguang.wu,
mel, npiggin, akpm, linux-mm
For each of the active, inactive and unevictable LRU lists, we would like to
count the number of dirty file pages. This becomes useful for monitoring and
tracking the efficiency of the page reclaim path under heavy IO workloads.
We export the new accounting through the global /proc/meminfo as well as the
per-node meminfo. Ideally, the accounting satisfies:
Dirty = ActiveDirty(file) + InactiveDirty(file) + Unevict_Dirty(file)
Example output:
$ ddtest -D /export/hda3/dd -b 1024 -n 1048576 -t 5 &
$ cat /proc/meminfo
ActiveDirty(file): 4044 kB
InactiveDirty(file): 8800 kB
Unevict_Dirty(file): 0 kB
Dirty: 12844 kB
$ cat /sys/devices/system/node/node0/meminfo
Node 0 Active_Dirty(file): 656 kB
Node 0 Inactive_Dirty(file): 6336 kB
Node 0 Unevict_Dirty(file): 0 kB
Node 0 Dirty: 6992 kB
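As a quick sanity check, the identity can be verified from userspace. Below is
a minimal sketch (a hypothetical helper, not part of this patch; the field
names assume the format strings introduced below):

#include <stdio.h>

int main(void)
{
	char line[128];
	unsigned long v, lru_sum = 0, dirty = 0;
	FILE *f = fopen("/proc/meminfo", "r");

	if (!f)
		return 1;
	while (fgets(line, sizeof(line), f)) {
		/* the three per-lru counters added by this patch */
		if (sscanf(line, "ActiveDirty(file): %lu kB", &v) == 1 ||
		    sscanf(line, "InactiveDirty(file): %lu kB", &v) == 1 ||
		    sscanf(line, "Unevict_Dirty(file): %lu kB", &v) == 1)
			lru_sum += v;
		else if (sscanf(line, "Dirty: %lu kB", &v) == 1)
			dirty = v;
	}
	fclose(f);
	printf("lru dirty sum: %lu kB  Dirty: %lu kB  delta: %ld kB\n",
	       lru_sum, dirty, (long)lru_sum - (long)dirty);
	return 0;
}

Given the race described below, the delta is expected to stay within a few
pages rather than at exactly zero.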
The current patch does not do this perfectly. Over time, I observed a
difference of a few pages between the two counters (total LRU dirty vs. Dirty).
That is because a page can go from dirty to clean, and from clean to dirty,
while it is on the LRU, and there is no lock I can take to prevent that. One
such race looks like:
1. account_page_dirtied
if page on active
-> __inc_zone_page_state_dirty on active
2. isolate_lru_pages from active
if page is dirty
-> dec_zone_page_state_dirty on active
The page can go from dirty to clean between 1 and 2, so we end up with one
extra page counted in ActiveDirty.
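For illustration, here is a toy, single-threaded replay of that interleaving
(plain userspace C, not kernel code; the two branches stand in for the real
account_page_dirtied() and isolate_lru_pages() paths):

#include <stdio.h>
#include <stdbool.h>

struct page {
	bool dirty;
	bool active;
};

static long nr_active_dirty;

int main(void)
{
	struct page p = { .dirty = true, .active = true };

	/* 1. account_page_dirtied(): page is dirty and on the active lru */
	if (p.dirty && p.active)
		nr_active_dirty++;

	/* the page is cleaned in between, by a path that never reaches
	 * the per-lru decrement */
	p.dirty = false;

	/* 2. isolate_lru_pages(): decrement only if the page is still dirty */
	if (p.dirty && p.active)
		nr_active_dirty--;

	/* prints 1: one stale page left in the ActiveDirty count */
	printf("nr_active_dirty = %ld\n", nr_active_dirty);
	return 0;
}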
At this point, I would like to collect feedback from upstream on whether there
is a feasible way to solve this race condition.
Signed-off-by: Ying Han <yinghan@google.com>
---
drivers/base/node.c | 64 ++++++++++++++++++---------------
fs/proc/meminfo.c | 86 ++++++++++++++++++++++++---------------------
include/linux/mm.h | 1 +
include/linux/mm_inline.h | 35 ++++++++++++++++++
include/linux/mmzone.h | 3 ++
include/linux/vmstat.h | 15 ++++++++
mm/filemap.c | 2 +
mm/page-writeback.c | 6 +++
mm/page_alloc.c | 6 +++
mm/truncate.c | 2 +
mm/vmscan.c | 28 ++++++++++++++-
mm/vmstat.c | 41 +++++++++++++++++++++
12 files changed, 219 insertions(+), 70 deletions(-)
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 2872e86..de5f198 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -67,17 +67,20 @@ static ssize_t node_read_meminfo(struct sys_device * dev,
si_meminfo_node(&i, nid);
n = sprintf(buf,
- "Node %d MemTotal: %8lu kB\n"
- "Node %d MemFree: %8lu kB\n"
- "Node %d MemUsed: %8lu kB\n"
- "Node %d Active: %8lu kB\n"
- "Node %d Inactive: %8lu kB\n"
- "Node %d Active(anon): %8lu kB\n"
- "Node %d Inactive(anon): %8lu kB\n"
- "Node %d Active(file): %8lu kB\n"
- "Node %d Inactive(file): %8lu kB\n"
- "Node %d Unevictable: %8lu kB\n"
- "Node %d Mlocked: %8lu kB\n",
+ "Node %d MemTotal: %8lu kB\n"
+ "Node %d MemFree: %8lu kB\n"
+ "Node %d MemUsed: %8lu kB\n"
+ "Node %d Active: %8lu kB\n"
+ "Node %d Inactive: %8lu kB\n"
+ "Node %d Active(anon): %8lu kB\n"
+ "Node %d Inactive(anon): %8lu kB\n"
+ "Node %d Active(file): %8lu kB\n"
+ "Node %d Inactive(file): %8lu kB\n"
+ "Node %d Unevictable: %8lu kB\n"
+ "Node %d Active_Dirty(file): %8lu kB\n"
+ "Node %d Inactive_Dirty(file): %8lu kB\n"
+ "Node %d Unevict_Dirty(file): %8lu kB\n"
+ "Node %d Mlocked: %8lu kB\n",
nid, K(i.totalram),
nid, K(i.freeram),
nid, K(i.totalram - i.freeram),
@@ -90,34 +93,37 @@ static ssize_t node_read_meminfo(struct sys_device * dev,
nid, K(node_page_state(nid, NR_ACTIVE_FILE)),
nid, K(node_page_state(nid, NR_INACTIVE_FILE)),
nid, K(node_page_state(nid, NR_UNEVICTABLE)),
+ nid, K(node_page_state(nid, NR_ACTIVE_DIRTY)),
+ nid, K(node_page_state(nid, NR_INACTIVE_DIRTY)),
+ nid, K(node_page_state(nid, NR_UNEVICTABLE_DIRTY)),
nid, K(node_page_state(nid, NR_MLOCK)));
#ifdef CONFIG_HIGHMEM
n += sprintf(buf + n,
- "Node %d HighTotal: %8lu kB\n"
- "Node %d HighFree: %8lu kB\n"
- "Node %d LowTotal: %8lu kB\n"
- "Node %d LowFree: %8lu kB\n",
+ "Node %d HighTotal: %8lu kB\n"
+ "Node %d HighFree: %8lu kB\n"
+ "Node %d LowTotal: %8lu kB\n"
+ "Node %d LowFree: %8lu kB\n",
nid, K(i.totalhigh),
nid, K(i.freehigh),
nid, K(i.totalram - i.totalhigh),
nid, K(i.freeram - i.freehigh));
#endif
n += sprintf(buf + n,
- "Node %d Dirty: %8lu kB\n"
- "Node %d Writeback: %8lu kB\n"
- "Node %d FilePages: %8lu kB\n"
- "Node %d Mapped: %8lu kB\n"
- "Node %d AnonPages: %8lu kB\n"
- "Node %d Shmem: %8lu kB\n"
- "Node %d KernelStack: %8lu kB\n"
- "Node %d PageTables: %8lu kB\n"
- "Node %d NFS_Unstable: %8lu kB\n"
- "Node %d Bounce: %8lu kB\n"
- "Node %d WritebackTmp: %8lu kB\n"
- "Node %d Slab: %8lu kB\n"
- "Node %d SReclaimable: %8lu kB\n"
- "Node %d SUnreclaim: %8lu kB\n",
+ "Node %d Dirty: %8lu kB\n"
+ "Node %d Writeback: %8lu kB\n"
+ "Node %d FilePages: %8lu kB\n"
+ "Node %d Mapped: %8lu kB\n"
+ "Node %d AnonPages: %8lu kB\n"
+ "Node %d Shmem: %8lu kB\n"
+ "Node %d KernelStack: %8lu kB\n"
+ "Node %d PageTables: %8lu kB\n"
+ "Node %d NFS_Unstable: %8lu kB\n"
+ "Node %d Bounce: %8lu kB\n"
+ "Node %d WritebackTmp: %8lu kB\n"
+ "Node %d Slab: %8lu kB\n"
+ "Node %d SReclaimable: %8lu kB\n"
+ "Node %d SUnreclaim: %8lu kB\n",
nid, K(node_page_state(nid, NR_FILE_DIRTY)),
nid, K(node_page_state(nid, NR_WRITEBACK)),
nid, K(node_page_state(nid, NR_FILE_PAGES)),
diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
index a65239c..ac2a664 100644
--- a/fs/proc/meminfo.c
+++ b/fs/proc/meminfo.c
@@ -53,53 +53,56 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
* Tagged format, for easy grepping and expansion.
*/
seq_printf(m,
- "MemTotal: %8lu kB\n"
- "MemFree: %8lu kB\n"
- "Buffers: %8lu kB\n"
- "Cached: %8lu kB\n"
- "SwapCached: %8lu kB\n"
- "Active: %8lu kB\n"
- "Inactive: %8lu kB\n"
- "Active(anon): %8lu kB\n"
- "Inactive(anon): %8lu kB\n"
- "Active(file): %8lu kB\n"
- "Inactive(file): %8lu kB\n"
- "Unevictable: %8lu kB\n"
- "Mlocked: %8lu kB\n"
+ "MemTotal: %8lu kB\n"
+ "MemFree: %8lu kB\n"
+ "Buffers: %8lu kB\n"
+ "Cached: %8lu kB\n"
+ "SwapCached: %8lu kB\n"
+ "Active: %8lu kB\n"
+ "Inactive: %8lu kB\n"
+ "Active(anon): %8lu kB\n"
+ "Inactive(anon): %8lu kB\n"
+ "Active(file): %8lu kB\n"
+ "Inactive(file): %8lu kB\n"
+ "Unevictable: %8lu kB\n"
+ "ActiveDirty(file): %8lu kB\n"
+ "InactiveDirty(file): %8lu kB\n"
+ "Unevict_Dirty(file): %8lu kB\n"
+ "Mlocked: %8lu kB\n"
#ifdef CONFIG_HIGHMEM
- "HighTotal: %8lu kB\n"
- "HighFree: %8lu kB\n"
- "LowTotal: %8lu kB\n"
- "LowFree: %8lu kB\n"
+ "HighTotal: %8lu kB\n"
+ "HighFree: %8lu kB\n"
+ "LowTotal: %8lu kB\n"
+ "LowFree: %8lu kB\n"
#endif
#ifndef CONFIG_MMU
- "MmapCopy: %8lu kB\n"
+ "MmapCopy: %8lu kB\n"
#endif
- "SwapTotal: %8lu kB\n"
- "SwapFree: %8lu kB\n"
- "Dirty: %8lu kB\n"
- "Writeback: %8lu kB\n"
- "AnonPages: %8lu kB\n"
- "Mapped: %8lu kB\n"
- "Shmem: %8lu kB\n"
- "Slab: %8lu kB\n"
- "SReclaimable: %8lu kB\n"
- "SUnreclaim: %8lu kB\n"
- "KernelStack: %8lu kB\n"
- "PageTables: %8lu kB\n"
+ "SwapTotal: %8lu kB\n"
+ "SwapFree: %8lu kB\n"
+ "Dirty: %8lu kB\n"
+ "Writeback: %8lu kB\n"
+ "AnonPages: %8lu kB\n"
+ "Mapped: %8lu kB\n"
+ "Shmem: %8lu kB\n"
+ "Slab: %8lu kB\n"
+ "SReclaimable: %8lu kB\n"
+ "SUnreclaim: %8lu kB\n"
+ "KernelStack: %8lu kB\n"
+ "PageTables: %8lu kB\n"
#ifdef CONFIG_QUICKLIST
- "Quicklists: %8lu kB\n"
+ "Quicklists: %8lu kB\n"
#endif
- "NFS_Unstable: %8lu kB\n"
- "Bounce: %8lu kB\n"
- "WritebackTmp: %8lu kB\n"
- "CommitLimit: %8lu kB\n"
- "Committed_AS: %8lu kB\n"
- "VmallocTotal: %8lu kB\n"
- "VmallocUsed: %8lu kB\n"
- "VmallocChunk: %8lu kB\n"
+ "NFS_Unstable: %8lu kB\n"
+ "Bounce: %8lu kB\n"
+ "WritebackTmp: %8lu kB\n"
+ "CommitLimit: %8lu kB\n"
+ "Committed_AS: %8lu kB\n"
+ "VmallocTotal: %8lu kB\n"
+ "VmallocUsed: %8lu kB\n"
+ "VmallocChunk: %8lu kB\n"
#ifdef CONFIG_MEMORY_FAILURE
- "HardwareCorrupted: %5lu kB\n"
+ "HardwareCorrupted: %5lu kB\n"
#endif
,
K(i.totalram),
@@ -114,6 +117,9 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
K(pages[LRU_ACTIVE_FILE]),
K(pages[LRU_INACTIVE_FILE]),
K(pages[LRU_UNEVICTABLE]),
+ K(global_page_state(NR_ACTIVE_DIRTY)),
+ K(global_page_state(NR_INACTIVE_DIRTY)),
+ K(global_page_state(NR_UNEVICTABLE_DIRTY)),
K(global_page_state(NR_MLOCK)),
#ifdef CONFIG_HIGHMEM
K(i.totalhigh),
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e6b1210..9ae8a1a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -14,6 +14,7 @@
#include <linux/mm_types.h>
#include <linux/range.h>
#include <linux/pfn.h>
+#include <linux/backing-dev.h>
struct mempolicy;
struct anon_vma;
diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 8835b87..754cecb 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -22,17 +22,41 @@ static inline int page_is_file_cache(struct page *page)
static inline void
add_page_to_lru_list(struct zone *zone, struct page *page, enum lru_list l)
{
+ struct address_space *mapping = page_mapping(page);
+
list_add(&page->lru, &zone->lru[l].list);
__inc_zone_state(zone, NR_LRU_BASE + l);
mem_cgroup_add_lru_list(page, l);
+ if (PageDirty(page) && mapping &&
+ mapping_cap_account_dirty(mapping)) {
+ if (is_active_lru(l))
+ __inc_zone_state(zone, NR_ACTIVE_DIRTY);
+ else if (is_unevictable_lru(l))
+ __inc_zone_state(zone, NR_UNEVICTABLE_DIRTY);
+ else
+ __inc_zone_state(zone, NR_INACTIVE_DIRTY);
+ }
+
}
static inline void
del_page_from_lru_list(struct zone *zone, struct page *page, enum lru_list l)
{
+ struct address_space *mapping = page_mapping(page);
+
list_del(&page->lru);
__dec_zone_state(zone, NR_LRU_BASE + l);
mem_cgroup_del_lru_list(page, l);
+
+ if (PageDirty(page) && mapping &&
+ mapping_cap_account_dirty(mapping)) {
+ if (is_active_lru(l))
+ __dec_zone_state(zone, NR_ACTIVE_DIRTY);
+ else if (is_unevictable_lru(l))
+ __dec_zone_state(zone, NR_UNEVICTABLE_DIRTY);
+ else
+ __dec_zone_state(zone, NR_INACTIVE_DIRTY);
+ }
}
/**
@@ -54,17 +78,28 @@ static inline void
del_page_from_lru(struct zone *zone, struct page *page)
{
enum lru_list l;
+ struct address_space *mapping = page_mapping(page);
list_del(&page->lru);
if (PageUnevictable(page)) {
__ClearPageUnevictable(page);
l = LRU_UNEVICTABLE;
+ if (PageDirty(page) && mapping &&
+ mapping_cap_account_dirty(mapping))
+ __dec_zone_state(zone, NR_UNEVICTABLE_DIRTY);
} else {
l = page_lru_base_type(page);
if (PageActive(page)) {
__ClearPageActive(page);
l += LRU_ACTIVE;
}
+ if (PageDirty(page) && mapping &&
+ mapping_cap_account_dirty(mapping)) {
+ if (is_active_lru(l))
+ __dec_zone_state(zone, NR_ACTIVE_DIRTY);
+ else
+ __dec_zone_state(zone, NR_INACTIVE_DIRTY);
+ }
}
__dec_zone_state(zone, NR_LRU_BASE + l);
mem_cgroup_del_lru_list(page, l);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 6e6e626..033d1f9 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -85,6 +85,9 @@ enum zone_stat_item {
NR_INACTIVE_FILE, /* " " " " " */
NR_ACTIVE_FILE, /* " " " " " */
NR_UNEVICTABLE, /* " " " " " */
+ NR_INACTIVE_DIRTY,
+ NR_ACTIVE_DIRTY,
+ NR_UNEVICTABLE_DIRTY,
NR_MLOCK, /* mlock()ed pages found and moved off LRU */
NR_ANON_PAGES, /* Mapped anonymous pages */
NR_FILE_MAPPED, /* pagecache pages mapped into pagetables.
diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 7f43ccd..77e5f4f 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -218,12 +218,15 @@ static inline void zap_zone_vm_stats(struct zone *zone)
extern void inc_zone_state(struct zone *, enum zone_stat_item);
#ifdef CONFIG_SMP
+void __mod_zone_page_state_dirty(struct zone *, enum zone_stat_item item, int);
void __mod_zone_page_state(struct zone *, enum zone_stat_item item, int);
+void __inc_zone_page_state_dirty(struct page *);
void __inc_zone_page_state(struct page *, enum zone_stat_item);
void __dec_zone_page_state(struct page *, enum zone_stat_item);
void mod_zone_page_state(struct zone *, enum zone_stat_item, int);
void inc_zone_page_state(struct page *, enum zone_stat_item);
+void dec_zone_page_state_dirty(struct page *);
void dec_zone_page_state(struct page *, enum zone_stat_item);
extern void inc_zone_state(struct zone *, enum zone_stat_item);
@@ -238,6 +241,17 @@ void refresh_cpu_vm_stats(int);
* We do not maintain differentials in a single processor configuration.
* The functions directly modify the zone and global counters.
*/
+static inline void __mod_zone_page_state_dirty(struct zone *zone,
+ enum zone_stat_item item, int delta)
+{
+ if (is_active_lru(item))
+ zone_page_state_add(delta, zone, NR_ACTIVE_DIRTY);
+ else if (is_unevictable_lru(item))
+ zone_page_state_add(delta, zone, NR_UNEVICTABLE_DIRTY);
+ else
+ zone_page_state_add(delta, zone, NR_INACTIVE_DIRTY);
+}
+
static inline void __mod_zone_page_state(struct zone *zone,
enum zone_stat_item item, int delta)
{
@@ -275,6 +289,7 @@ static inline void __dec_zone_page_state(struct page *page,
#define inc_zone_page_state __inc_zone_page_state
#define dec_zone_page_state __dec_zone_page_state
#define mod_zone_page_state __mod_zone_page_state
+#define dec_zone_page_state_dirty __dec_zone_page_state_dirty
static inline void refresh_cpu_vm_stats(int cpu) { }
#endif
diff --git a/mm/filemap.c b/mm/filemap.c
index 3d4df44..597aca0 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -137,6 +137,8 @@ void __remove_from_page_cache(struct page *page)
if (PageDirty(page) && mapping_cap_account_dirty(mapping)) {
dec_zone_page_state(page, NR_FILE_DIRTY);
dec_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
+ if (PageLRU(page))
+ dec_zone_page_state_dirty(page);
}
}
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index e3bccac..c65916d 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1122,7 +1122,10 @@ void account_page_dirtied(struct page *page, struct address_space *mapping)
if (mapping_cap_account_dirty(mapping)) {
__inc_zone_page_state(page, NR_FILE_DIRTY);
__inc_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
+ if (PageLRU(page))
+ __inc_zone_page_state_dirty(page);
task_dirty_inc(current);
+
task_io_account_write(PAGE_CACHE_SIZE);
}
}
@@ -1299,6 +1302,9 @@ int clear_page_dirty_for_io(struct page *page)
dec_zone_page_state(page, NR_FILE_DIRTY);
dec_bdi_stat(mapping->backing_dev_info,
BDI_RECLAIMABLE);
+ if (PageLRU(page))
+ dec_zone_page_state_dirty(page);
+
return 1;
}
return 0;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a9649f4..1f26a3e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2404,6 +2404,9 @@ void show_free_areas(void)
" active_file:%lukB"
" inactive_file:%lukB"
" unevictable:%lukB"
+ " active_dirty:%lukB"
+ " inactive_dirty:%lukB"
+ " unevictable_dirty:%lukB"
" isolated(anon):%lukB"
" isolated(file):%lukB"
" present:%lukB"
@@ -2432,6 +2435,9 @@ void show_free_areas(void)
K(zone_page_state(zone, NR_ACTIVE_FILE)),
K(zone_page_state(zone, NR_INACTIVE_FILE)),
K(zone_page_state(zone, NR_UNEVICTABLE)),
+ K(zone_page_state(zone, NR_ACTIVE_DIRTY)),
+ K(zone_page_state(zone, NR_INACTIVE_DIRTY)),
+ K(zone_page_state(zone, NR_UNEVICTABLE_DIRTY)),
K(zone_page_state(zone, NR_ISOLATED_ANON)),
K(zone_page_state(zone, NR_ISOLATED_FILE)),
K(zone->present_pages),
diff --git a/mm/truncate.c b/mm/truncate.c
index ba887bf..1659420 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -77,6 +77,8 @@ void cancel_dirty_page(struct page *page, unsigned int account_size)
dec_zone_page_state(page, NR_FILE_DIRTY);
dec_bdi_stat(mapping->backing_dev_info,
BDI_RECLAIMABLE);
+ if (PageLRU(page))
+ dec_zone_page_state_dirty(page);
if (account_size)
task_io_account_cancelled_write(account_size);
}
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c391c32..b67f785 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -952,8 +952,10 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
unsigned long end_pfn;
unsigned long page_pfn;
int zone_id;
+ struct address_space *mapping;
page = lru_to_page(src);
+ mapping = page_mapping(page);
prefetchw_prev_lru_page(page, src, flags);
VM_BUG_ON(!PageLRU(page));
@@ -963,6 +965,9 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
list_move(&page->lru, dst);
mem_cgroup_del_lru(page);
nr_taken++;
+ if (PageDirty(page) && mapping &&
+ mapping_cap_account_dirty(mapping))
+ dec_zone_page_state_dirty(page);
break;
case -EBUSY:
@@ -993,6 +998,7 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
end_pfn = pfn + (1 << order);
for (; pfn < end_pfn; pfn++) {
struct page *cursor_page;
+ struct address_space *mapping;
/* The target page is in the block, ignore it. */
if (unlikely(pfn == page_pfn))
@@ -1003,6 +1009,7 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
break;
cursor_page = pfn_to_page(pfn);
+ mapping = page_mapping(cursor_page);
/* Check that we have not crossed a zone boundary. */
if (unlikely(page_zone_id(cursor_page) != zone_id))
@@ -1024,6 +1031,10 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
nr_lumpy_taken++;
if (PageDirty(cursor_page))
nr_lumpy_dirty++;
+
+ if (PageDirty(cursor_page) && mapping &&
+ mapping_cap_account_dirty(mapping))
+ dec_zone_page_state_dirty(cursor_page);
scan++;
} else {
if (mode == ISOLATE_BOTH &&
@@ -1149,7 +1160,6 @@ static int too_many_isolated(struct zone *zone, int file,
inactive = zone_page_state(zone, NR_INACTIVE_ANON);
isolated = zone_page_state(zone, NR_ISOLATED_ANON);
}
-
return isolated > inactive;
}
@@ -1385,19 +1395,25 @@ static void move_active_pages_to_lru(struct zone *zone,
enum lru_list lru)
{
unsigned long pgmoved = 0;
+ unsigned long pgdirty = 0;
struct pagevec pvec;
struct page *page;
+ struct address_space *mapping;
pagevec_init(&pvec, 1);
while (!list_empty(list)) {
page = lru_to_page(list);
+ mapping = page_mapping(page);
VM_BUG_ON(PageLRU(page));
SetPageLRU(page);
list_move(&page->lru, &zone->lru[lru].list);
mem_cgroup_add_lru_list(page, lru);
+ if (PageDirty(page) && mapping &&
+ mapping_cap_account_dirty(mapping))
+ pgdirty++;
pgmoved++;
if (!pagevec_add(&pvec, page) || list_empty(list)) {
@@ -1409,6 +1425,8 @@ static void move_active_pages_to_lru(struct zone *zone,
}
}
__mod_zone_page_state(zone, NR_LRU_BASE + lru, pgmoved);
+ __mod_zone_page_state_dirty(zone, lru, pgdirty);
+
if (!is_active_lru(lru))
__count_vm_events(PGDEACTIVATE, pgmoved);
}
@@ -1774,6 +1792,7 @@ static void shrink_zone(int priority, struct zone *zone,
*/
if (nr_reclaimed >= nr_to_reclaim && priority < DEF_PRIORITY)
break;
+
}
sc->nr_reclaimed = nr_reclaimed;
@@ -2800,6 +2819,7 @@ int page_evictable(struct page *page, struct vm_area_struct *vma)
*/
static void check_move_unevictable_page(struct page *page, struct zone *zone)
{
+ struct address_space *mapping = page_mapping(page);
VM_BUG_ON(PageActive(page));
retry:
@@ -2812,6 +2832,12 @@ retry:
mem_cgroup_move_lists(page, LRU_UNEVICTABLE, l);
__inc_zone_state(zone, NR_INACTIVE_ANON + l);
__count_vm_event(UNEVICTABLE_PGRESCUED);
+ if (PageDirty(page) && mapping &&
+ mapping_cap_account_dirty(mapping)) {
+ __dec_zone_state(zone, NR_UNEVICTABLE_DIRTY);
+ __inc_zone_state(zone, NR_INACTIVE_DIRTY);
+ }
+
} else {
/*
* rotate unevictable list
diff --git a/mm/vmstat.c b/mm/vmstat.c
index f389168..ee738b7 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -146,6 +146,18 @@ static void refresh_zone_stat_thresholds(void)
}
}
+void __mod_zone_page_state_dirty(struct zone *zone,
+ enum zone_stat_item item, int delta)
+{
+ if (is_active_lru(item))
+ __mod_zone_page_state(zone, NR_ACTIVE_DIRTY, delta);
+ else if (is_unevictable_lru(item))
+ __mod_zone_page_state(zone, NR_UNEVICTABLE_DIRTY, delta);
+ else
+ __mod_zone_page_state(zone, NR_INACTIVE_DIRTY, delta);
+}
+EXPORT_SYMBOL(__mod_zone_page_state_dirty);
+
/*
* For use when we know that interrupts are disabled.
*/
@@ -219,6 +231,17 @@ void __inc_zone_state(struct zone *zone, enum zone_stat_item item)
}
}
+void __inc_zone_page_state_dirty(struct page *page)
+{
+ if (PageActive(page))
+ __inc_zone_page_state(page, NR_ACTIVE_DIRTY);
+ else if (PageUnevictable(page))
+ __inc_zone_page_state(page, NR_UNEVICTABLE_DIRTY);
+ else
+ __inc_zone_page_state(page, NR_INACTIVE_DIRTY);
+}
+EXPORT_SYMBOL(__inc_zone_page_state_dirty);
+
void __inc_zone_page_state(struct page *page, enum zone_stat_item item)
{
__inc_zone_state(page_zone(page), item);
@@ -267,6 +290,21 @@ void inc_zone_page_state(struct page *page, enum zone_stat_item item)
}
EXPORT_SYMBOL(inc_zone_page_state);
+void dec_zone_page_state_dirty(struct page *page)
+{
+ unsigned long flags;
+
+ local_irq_save(flags);
+ if (PageActive(page))
+ __dec_zone_page_state(page, NR_ACTIVE_DIRTY);
+ else if (PageUnevictable(page))
+ __dec_zone_page_state(page, NR_UNEVICTABLE_DIRTY);
+ else
+ __dec_zone_page_state(page, NR_INACTIVE_DIRTY);
+ local_irq_restore(flags);
+}
+EXPORT_SYMBOL(dec_zone_page_state_dirty);
+
void dec_zone_page_state(struct page *page, enum zone_stat_item item)
{
unsigned long flags;
@@ -715,6 +753,9 @@ static const char * const vmstat_text[] = {
"nr_inactive_file",
"nr_active_file",
"nr_unevictable",
+ "nr_inactive_dirty",
+ "nr_active_dirty",
+ "nr_unevictable_dirty",
"nr_mlock",
"nr_anon_pages",
"nr_mapped",
--
1.7.1
* Re: [PATCH] [RFC] Dirty page accounting on lru basis.
@ 2010-09-03 7:50 ` KOSAKI Motohiro
From: KOSAKI Motohiro @ 2010-09-03 7:50 UTC
To: Ying Han
Cc: kosaki.motohiro, riel, minchan.kim, hugh.dickins, kamezawa.hiroyu,
fengguang.wu, mel, npiggin, akpm, linux-mm
> For each of the active, inactive and unevictable LRU lists, we would like to
> count the number of dirty file pages. This becomes useful for monitoring and
> tracking the efficiency of the page reclaim path under heavy IO workloads.
>
> We export the new accounting through the global /proc/meminfo as well as the
> per-node meminfo. Ideally, the accounting satisfies:
linux/Documentation/vm/page-types.c ?
% sudo ~/bin/page-types
flags page-count MB symbolic-flags long-symbolic-flags
0x0000000000000000 76660 299 __________________________________
0x0000000000000024 33920 132 __R__l____________________________ referenced,lru
0x0000000000000028 23774 92 ___U_l____________________________ uptodate,lru
0x0001000000000028 378 1 ___U_l________________________I___ uptodate,lru,readahead
0x000000000000002c 160474 626 __RU_l____________________________ referenced,uptodate,lru
0x0000000000004038 21 0 ___UDl________b___________________ uptodate,dirty,lru,swapbacked
0x000000000000003c 124491 486 __RUDl____________________________ referenced,uptodate,dirty,lru
0x000000000000403c 7 0 __RUDl________b___________________ referenced,uptodate,dirty,lru,swapbacked
0x0000000000000060 3521 13 _____lA___________________________ lru,active
0x0000000000000064 12681 49 __R__lA___________________________ referenced,lru,active
0x0000000000000068 5309 20 ___U_lA___________________________ uptodate,lru,active
0x000000000000006c 22840 89 __RU_lA___________________________ referenced,uptodate,lru,active
0x0000000000000074 60 0 __R_DlA___________________________ referenced,dirty,lru,active
0x000000000000007c 3 0 __RUDlA___________________________ referenced,uptodate,dirty,lru,active
0x0000000000000080 33810 132 _______S__________________________ slab
0x0004000000000080 179 0 _______S________________________A_ slab,slub_frozen
0x000000000000012c 7463 29 __RU_l__W_________________________ referenced,uptodate,lru,writeback
0x000000000000012d 2 0 L_RU_l__W_________________________ locked,referenced,uptodate,lru,writeback
0x0000000000000400 683 2 __________B_______________________ buddy
0x0000000000000800 16 0 ___________M______________________ mmap
0x0000000000000804 1 0 __R________M______________________ referenced,mmap
0x0000000000000828 42 0 ___U_l_____M______________________ uptodate,lru,mmap
0x000000000000082c 959 3 __RU_l_____M______________________ referenced,uptodate,lru,mmap
0x0000000000004838 4 0 ___UDl_____M__b___________________ uptodate,dirty,lru,mmap,swapbacked
0x0000000000000868 158 0 ___U_lA____M______________________ uptodate,lru,active,mmap
0x000000000000086c 3091 12 __RU_lA____M______________________ referenced,uptodate,lru,active,mmap
0x0000000000005808 2 0 ___U_______Ma_b___________________ uptodate,mmap,anonymous,swapbacked
0x0000000000005828 2702 10 ___U_l_____Ma_b___________________ uptodate,lru,mmap,anonymous,swapbacked
0x000000000000582c 22 0 __RU_l_____Ma_b___________________ referenced,uptodate,lru,mmap,anonymous,swapbacked
0x0000000000005868 8342 32 ___U_lA____Ma_b___________________ uptodate,lru,active,mmap,anonymous,swapbacked
0x000000000000586c 17 0 __RU_lA____Ma_b___________________ referenced,uptodate,lru,active,mmap,anonymous,swapbacked
total 521632 2037
That said, the same numbers can be read off the dump above (4 kB pages):
ActiveDirty(file): 240 kB (the 60 referenced,dirty,lru,active pages)
InactiveDirty(file): 486 MB (the 124491 referenced,uptodate,dirty,lru pages)
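For reference, here is a minimal sketch of deriving the same numbers directly
from /proc/kpageflags, which is what page-types reads (a hypothetical helper;
bit numbers per Documentation/vm/pagemap.txt, 4 kB pages assumed, needs root,
and unlike the patch it makes no file vs anon distinction):

#include <stdio.h>
#include <stdint.h>

/* bit numbers from Documentation/vm/pagemap.txt */
#define KPF_DIRTY	4
#define KPF_LRU		5
#define KPF_ACTIVE	6

int main(void)
{
	uint64_t flags;
	unsigned long active_dirty = 0, inactive_dirty = 0;
	FILE *f = fopen("/proc/kpageflags", "r");

	if (!f)
		return 1;
	/* one u64 of flag bits per physical page frame */
	while (fread(&flags, sizeof(flags), 1, f) == 1) {
		if (!(flags & (1ULL << KPF_DIRTY)) ||
		    !(flags & (1ULL << KPF_LRU)))
			continue;
		if (flags & (1ULL << KPF_ACTIVE))
			active_dirty++;
		else
			inactive_dirty++;
	}
	fclose(f);
	printf("active dirty: %lu kB  inactive dirty: %lu kB\n",
	       active_dirty * 4, inactive_dirty * 4);
	return 0;
}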