From: Mel Gorman <mel@csn.ul.ie>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: Christoph Lameter <cl@linux-foundation.org>,
	Adam Litke <agl@us.ibm.com>, Avi Kivity <avi@redhat.com>,
	David Rientjes <rientjes@google.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Mel Gorman <mel@csn.ul.ie>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 03/12] Export unusable free space index via /proc/pagetypeinfo
Date: Fri, 12 Feb 2010 12:00:50 +0000
Message-ID: <1265976059-7459-4-git-send-email-mel@csn.ul.ie>
In-Reply-To: <1265976059-7459-1-git-send-email-mel@csn.ul.ie>

The unusable free space index is a measure of external fragmentation that
takes the allocation size into account. The huge page size will usually be
the size of interest, but not always, so the index is exported on a
per-order and per-zone basis via /proc/pagetypeinfo.

The index is normally calculated as a value between 0 and 1, which is
unsuitable within the kernel because floating-point arithmetic is avoided
there. Instead, the value is scaled by 1000 and truncated, giving an
integer approximation between 0 and 1000 that keeps the first three
decimal places.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---
 Documentation/filesystems/proc.txt |   10 ++++
 mm/vmstat.c                        |   99 ++++++++++++++++++++++++++++++++++++
 2 files changed, 109 insertions(+), 0 deletions(-)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 1829dfb..0968a81 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -614,6 +614,10 @@ Node    0, zone    DMA32, type      Movable    169    152    113     91     77
 Node    0, zone    DMA32, type      Reserve      1      2      2      2      2      0      1      1      1      1      0
 Node    0, zone    DMA32, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
 
+Unusable free space index at order
+Node    0, zone      DMA                         0      0      0      2      6     18     34     67     99    227    485
+Node    0, zone    DMA32                         0      0      1      2      4      7     10     17     23     31     34
+
 Number of blocks type     Unmovable  Reclaimable      Movable      Reserve      Isolate
 Node 0, zone      DMA            2            0            5            1            0
 Node 0, zone    DMA32           41            6          967            2            0
@@ -629,6 +633,12 @@ then gives the same type of information as buddyinfo except broken down
 by migrate-type and finishes with details on how many page blocks of each
 type exist.
 
+The unusable free space index measures how much of the available free
+memory cannot be used to satisfy an allocation of a given size and is a
+value between 0 and 1000. The higher the value, the more free memory is
+unusable and, by implication, the worse the external fragmentation is. The
+percentage of unusable free memory can be found by dividing this value by 10.
+
 If min_free_kbytes has been tuned correctly (recommendations made by hugeadm
 from libhugetlbfs http://sourceforge.net/projects/libhugetlbfs/), one can
 make an estimate of the likely number of huge pages that can be allocated
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 6051fba..d05d610 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -451,6 +451,104 @@ static int frag_show(struct seq_file *m, void *arg)
 	return 0;
 }
 
+
+struct contig_page_info {
+	unsigned long free_pages;
+	unsigned long free_blocks_total;
+	unsigned long free_blocks_suitable;
+};
+
+/*
+ * Calculate the number of free pages in a zone, how many contiguous
+ * pages are free and how many are large enough to satisfy an allocation of
+ * the target size. Note that this function makes no attempt to estimate
+ * how many suitable free blocks there *might* be if MOVABLE pages were
+ * migrated. Calculating that is possible, but expensive and can be
+ * figured out from userspace.
+ */
+static void fill_contig_page_info(struct zone *zone,
+				unsigned int suitable_order,
+				struct contig_page_info *info)
+{
+	unsigned int order;
+
+	info->free_pages = 0;
+	info->free_blocks_total = 0;
+	info->free_blocks_suitable = 0;
+
+	for (order = 0; order < MAX_ORDER; order++) {
+		unsigned long blocks;
+
+		/* Count number of free blocks */
+		blocks = zone->free_area[order].nr_free;
+		info->free_blocks_total += blocks;
+
+		/* Count free base pages */
+		info->free_pages += blocks << order;
+
+		/* Count the suitable free blocks */
+		if (order >= suitable_order)
+			info->free_blocks_suitable += blocks <<
+						(order - suitable_order);
+	}
+}
+
+/*
+ * Return an index indicating how much of the available free memory is
+ * unusable for an allocation of the requested size.
+ */
+static int unusable_free_index(struct zone *zone,
+				unsigned int order,
+				struct contig_page_info *info)
+{
+	/* No free memory is interpreted as all free memory is unusable */
+	if (info->free_pages == 0)
+		return 1000;
+
+	/*
+	 * Index should be a value between 0 and 1. Return a value to 3
+	 * decimal places.
+	 *
+	 * 0 => no fragmentation
+	 * 1 => high fragmentation
+	 */
+	return ((info->free_pages - (info->free_blocks_suitable << order)) * 1000) / info->free_pages;
+
+}
+
+static void pagetypeinfo_showunusable_print(struct seq_file *m,
+					pg_data_t *pgdat, struct zone *zone)
+{
+	unsigned int order;
+
+	/* Alloc on stack as interrupts are disabled for zone walk */
+	struct contig_page_info info;
+
+	seq_printf(m, "Node %4d, zone %8s %19s",
+				pgdat->node_id,
+				zone->name, " ");
+	for (order = 0; order < MAX_ORDER; ++order) {
+		fill_contig_page_info(zone, order, &info);
+		seq_printf(m, "%6d ", unusable_free_index(zone, order, &info));
+	}
+
+	seq_putc(m, '\n');
+}
+
+/*
+ * Display unusable free space index
+ * XXX: Could be a lot more efficient, but it's not a critical path
+ */
+static int pagetypeinfo_showunusable(struct seq_file *m, void *arg)
+{
+	pg_data_t *pgdat = (pg_data_t *)arg;
+
+	seq_printf(m, "\nUnusable free space index at order\n");
+	walk_zones_in_node(m, pgdat, pagetypeinfo_showunusable_print);
+
+	return 0;
+}
+
 static void pagetypeinfo_showfree_print(struct seq_file *m,
 					pg_data_t *pgdat, struct zone *zone)
 {
@@ -558,6 +656,7 @@ static int pagetypeinfo_show(struct seq_file *m, void *arg)
 	seq_printf(m, "Pages per block:  %lu\n", pageblock_nr_pages);
 	seq_putc(m, '\n');
 	pagetypeinfo_showfree(m, pgdat);
+	pagetypeinfo_showunusable(m, pgdat);
 	pagetypeinfo_showblockcount(m, pgdat);
 
 	return 0;
-- 
1.6.5

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
