* [RFC PATCH 00/10] Support free page filtering looking up mem_map array
@ 2012-06-28 17:37 HATAYAMA Daisuke
  2012-06-28 17:38 ` [RFC PATCH 01/10] Move page flags setup for old kernels after debuginfo initialization HATAYAMA Daisuke
                   ` (11 more replies)
  0 siblings, 12 replies; 14+ messages in thread
From: HATAYAMA Daisuke @ 2012-06-28 17:37 UTC (permalink / raw)
  To: kumagai-atsushi; +Cc: kexec

Sorry for the late posting. This is an RFC patch set for free page
filtering that looks up the mem_map array. Unlike the existing method,
which walks the free page lists, this one runs in constant space.

I intend this patch set to be merged with Kumagai-san's cyclic patch
set, so I mark these with RFC. See TODO below. Also, I have yet to
test the logic for old kernels from v2.6.15 to v2.6.17.

This new free page filtering needs the following values.

  - OFFSET(page._mapcount)
  - OFFSET(page.private)
  - SIZE(pageflags)
  - NUMBER(PG_buddy)
  - NUMBER(PG_slab)
  - NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)

Unfortunately, the offsets of the _mapcount and private fields of the
page structure cannot be obtained from VMLINUX with the existing
library in makedumpfile, since both members belong to anonymous
unions. We need a new interface for them.
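
As an illustration only (this is not makedumpfile or libdw code), the
kind of lookup such an interface might need can be sketched with a toy
model: a type is described as a tree of members, and the search
descends into unnamed members while adding up offsets. The demo_member
type, the demo_find_offset function and the sample layout are all
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for DWARF member information. */
struct demo_member {
	const char *name;			/* NULL for an anonymous union/struct */
	long offset;				/* offset within the enclosing type */
	const struct demo_member *children;	/* members of an anonymous type */
	size_t nchildren;
};

/*
 * Return the offset of NAME from the start of the outermost type,
 * descending into anonymous members, or -1 if NAME is not found.
 */
long
demo_find_offset(const struct demo_member *m, size_t n, const char *name)
{
	size_t i;
	long sub;

	assert(name != NULL);

	for (i = 0; i < n; i++) {
		if (m[i].name) {
			if (!strcmp(m[i].name, name))
				return m[i].offset;
			continue;
		}
		/* Anonymous member: search inside it. */
		sub = demo_find_offset(m[i].children, m[i].nchildren, name);
		if (sub >= 0)
			return m[i].offset + sub;
	}
	return -1;
}

/* Sample layout, made up for the example (not a real struct page). */
static const struct demo_member demo_union1[] = {
	{ "_mapcount", 0, NULL, 0 },
	{ "inuse",     0, NULL, 0 },
};
static const struct demo_member demo_union2[] = {
	{ "private", 0, NULL, 0 },
};
const struct demo_member demo_page[] = {
	{ "flags",   0, NULL, 0 },
	{ "_count",  8, NULL, 0 },
	{ NULL,     12, demo_union1, 2 },
	{ NULL,     16, demo_union2, 1 },
};
```

With this sample layout, looking up "_mapcount" yields 12 and
"private" yields 16, although neither is a direct member of the
outermost type.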

To try out this patch set, it's handy to pass a manually edited
VMCOREINFO file via the -i option.

TODO

  1. Add new values in VMCOREINFO on the upstream kernel.

  2. Decide when to use this logic instead of the existing free list
  logic. The options are 1) to introduce a new dump level or 2) to use
  it automatically if --cyclic is specified. This patch set chooses 1),
  for RFC use only.

  3. Consider how to deal with old kernels to which we cannot add the
  values in VMCOREINFO. The options are 1) to force users to use
  VMLINUX, 2) to cover them entirely with hard-coded values, or 3) to
  give up supporting the full range of kernel versions.

---

HATAYAMA Daisuke (10):
      Add page_is_buddy for old kernels
      Add page_is_buddy for PG_buddy
      Add page_is_buddy for recent kernels
      Add exclude free pages by looking up mem_map array
      Add command-line processing for free page filtering looking up mem_map array
      Add page flag values as hardcoded values
      Add debuginfo-related processing for VMCOREINFO/VMLINUX
      Add new parameters for various tables
      Add debuginfo interface for enum type size
      Move page flags setup for old kernels after debuginfo initialization


 dwarf_info.c   |   29 +++++++++++----
 dwarf_info.h   |    1 +
 makedumpfile.c |  111 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
 makedumpfile.h |   34 ++++++++++++++++-
 4 files changed, 158 insertions(+), 17 deletions(-)

-- 

Thanks.
HATAYAMA, Daisuke

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


* [RFC PATCH 01/10] Move page flags setup for old kernels after debuginfo initialization
  2012-06-28 17:37 [RFC PATCH 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
@ 2012-06-28 17:38 ` HATAYAMA Daisuke
  2012-06-28 17:38 ` [RFC PATCH 02/10] Add debuginfo interface for enum type size HATAYAMA Daisuke
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: HATAYAMA Daisuke @ 2012-06-28 17:38 UTC (permalink / raw)
  To: kumagai-atsushi; +Cc: kexec

The hard-coded values must be used only if the corresponding values
are not provided by debuginfo, so this setup should be done after
debuginfo initialization.

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
---

 makedumpfile.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index d024e95..a6b3de7 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -2643,8 +2643,6 @@ initial(void)
 		debug_info = TRUE;
 	}
 
-	if (!get_value_for_old_linux())
-		return FALSE;
 out:
 	if (!info->page_size) {
 		/*
@@ -2706,6 +2704,9 @@ out:
 			return FALSE;
 	}
 
+	if (!get_value_for_old_linux())
+		return FALSE;
+
 	return TRUE;
 }
 



* [RFC PATCH 02/10] Add debuginfo interface for enum type size
  2012-06-28 17:37 [RFC PATCH 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
  2012-06-28 17:38 ` [RFC PATCH 01/10] Move page flags setup for old kernels after debuginfo initialization HATAYAMA Daisuke
@ 2012-06-28 17:38 ` HATAYAMA Daisuke
  2012-06-28 17:38 ` [RFC PATCH 03/10] Add new parameters for various tables HATAYAMA Daisuke
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: HATAYAMA Daisuke @ 2012-06-28 17:38 UTC (permalink / raw)
  To: kumagai-atsushi; +Cc: kexec

This patch set needs this to determine whether a certain enumeration
type exists in a given debuginfo. The interface is a simple extension
of the existing one for enumeration values.

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
---

 dwarf_info.c   |   29 +++++++++++++++++++++--------
 dwarf_info.h   |    1 +
 makedumpfile.h |   13 +++++++++++--
 3 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/dwarf_info.c b/dwarf_info.c
index 1429858..98efc26 100644
--- a/dwarf_info.c
+++ b/dwarf_info.c
@@ -75,7 +75,8 @@ is_search_structure(int cmd)
 static int
 is_search_number(int cmd)
 {
-	if (cmd == DWARF_INFO_GET_ENUM_NUMBER)
+	if ((cmd == DWARF_INFO_GET_ENUM_NUMBER)
+	    || (cmd == DWARF_INFO_GET_ENUMERATION_TYPE_SIZE))
 		return TRUE;
 	else
 		return FALSE;
@@ -647,7 +648,7 @@ search_structure(Dwarf_Die *die, int *found)
 static void
 search_number(Dwarf_Die *die, int *found)
 {
-	int tag;
+	int tag, bytesize;
 	Dwarf_Word const_value;
 	Dwarf_Attribute attr;
 	Dwarf_Die child, *walker;
@@ -658,6 +659,22 @@ search_number(Dwarf_Die *die, int *found)
 		if (tag != DW_TAG_enumeration_type)
 			continue;
 
+		if (dwarf_info.cmd == DWARF_INFO_GET_ENUMERATION_TYPE_SIZE) {
+			name = dwarf_diename(die);
+
+			if (!name || strcmp(name, dwarf_info.struct_name))
+				continue;
+
+			if ((bytesize = dwarf_bytesize(die)) <= 0)
+				continue;
+
+			*found = TRUE;
+
+			dwarf_info.struct_size = bytesize;
+
+			return;
+		}
+
 		if (dwarf_child(die, &child) != 0)
 			continue;
 
@@ -1026,13 +1043,9 @@ out:
  * Get the size of structure.
  */
 long
-get_structure_size(char *structname, int flag_typedef)
+get_structure_size(char *structname, int cmd)
 {
-	if (flag_typedef)
-		dwarf_info.cmd = DWARF_INFO_GET_TYPEDEF_SIZE;
-	else
-		dwarf_info.cmd = DWARF_INFO_GET_STRUCT_SIZE;
-
+	dwarf_info.cmd = cmd;
 	dwarf_info.struct_name = structname;
 	dwarf_info.struct_size = NOT_FOUND_STRUCTURE;
 
diff --git a/dwarf_info.h b/dwarf_info.h
index 1e07484..b445738 100644
--- a/dwarf_info.h
+++ b/dwarf_info.h
@@ -47,6 +47,7 @@ enum {
 	DWARF_INFO_CHECK_SYMBOL_ARRAY_TYPE,
 	DWARF_INFO_GET_SYMBOL_TYPE,
 	DWARF_INFO_GET_MEMBER_TYPE,
+	DWARF_INFO_GET_ENUMERATION_TYPE_SIZE,
 };
 
 char *get_dwarf_module_name(void);
diff --git a/makedumpfile.h b/makedumpfile.h
index 6f5489d..95c0abc 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -225,14 +225,23 @@ do { \
 #define ARRAY_LENGTH(X)		(array_table.X)
 #define SIZE_INIT(X, Y) \
 do { \
-	if ((SIZE(X) = get_structure_size(Y, 0)) == FAILED_DWARFINFO) \
+	if ((SIZE(X) = get_structure_size(Y, DWARF_INFO_GET_STRUCT_SIZE)) \
+	     == FAILED_DWARFINFO) \
 		return FALSE; \
 } while (0)
 #define TYPEDEF_SIZE_INIT(X, Y) \
 do { \
-	if ((SIZE(X) = get_structure_size(Y, 1)) == FAILED_DWARFINFO) \
+	if ((SIZE(X) = get_structure_size(Y, DWARF_INFO_GET_TYPEDEF_SIZE)) \
+	     == FAILED_DWARFINFO) \
 		return FALSE; \
 } while (0)
+#define ENUM_TYPE_SIZE_INIT(X, Y) \
+do { \
+	if ((SIZE(X) = get_structure_size(Y,	\
+		DWARF_INFO_GET_ENUMERATION_TYPE_SIZE))	\
+	     == FAILED_DWARFINFO)			\
+ 		return FALSE; \
+ } while (0)
 #define OFFSET_INIT(X, Y, Z) \
 do { \
 	if ((OFFSET(X) = get_member_offset(Y, Z, DWARF_INFO_GET_MEMBER_OFFSET)) \



* [RFC PATCH 03/10] Add new parameters for various tables
  2012-06-28 17:37 [RFC PATCH 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
  2012-06-28 17:38 ` [RFC PATCH 01/10] Move page flags setup for old kernels after debuginfo initialization HATAYAMA Daisuke
  2012-06-28 17:38 ` [RFC PATCH 02/10] Add debuginfo interface for enum type size HATAYAMA Daisuke
@ 2012-06-28 17:38 ` HATAYAMA Daisuke
  2012-06-28 17:38 ` [RFC PATCH 04/10] Add debuginfo-related processing for VMCOREINFO/VMLINUX HATAYAMA Daisuke
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: HATAYAMA Daisuke @ 2012-06-28 17:38 UTC (permalink / raw)
  To: kumagai-atsushi; +Cc: kexec

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
---

 makedumpfile.h |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/makedumpfile.h b/makedumpfile.h
index 95c0abc..2808871 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -1073,6 +1073,8 @@ struct size_table {
 	long	cpumask_t;
 	long	kexec_segment;
 	long	elf64_hdr;
+
+	long	pageflags;
 };
 
 struct offset_table {
@@ -1081,6 +1083,8 @@ struct offset_table {
 		long	_count;
 		long	mapping;
 		long	lru;
+		long	_mapcount;
+		long	private;
 	} page;
 	struct mem_section {
 		long	section_mem_map;
@@ -1242,6 +1246,10 @@ struct number_table {
 	long	PG_lru;
 	long	PG_private;
 	long	PG_swapcache;
+	long	PG_buddy;
+	long	PG_slab;
+
+	long	PAGE_BUDDY_MAPCOUNT_VALUE;
 };
 
 struct srcfile_table {



* [RFC PATCH 04/10] Add debuginfo-related processing for VMCOREINFO/VMLINUX
  2012-06-28 17:37 [RFC PATCH 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
                   ` (2 preceding siblings ...)
  2012-06-28 17:38 ` [RFC PATCH 03/10] Add new parameters for various tables HATAYAMA Daisuke
@ 2012-06-28 17:38 ` HATAYAMA Daisuke
  2012-06-28 17:38 ` [RFC PATCH 05/10] Add page flag values as hardcoded values HATAYAMA Daisuke
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: HATAYAMA Daisuke @ 2012-06-28 17:38 UTC (permalink / raw)
  To: kumagai-atsushi; +Cc: kexec

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
---

 makedumpfile.c |   20 ++++++++++++++++++++
 1 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index a6b3de7..d8da608 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -876,6 +876,8 @@ get_structure_info(void)
 	OFFSET_INIT(page._count, "page", "_count");
 
 	OFFSET_INIT(page.mapping, "page", "mapping");
+	OFFSET_INIT(page._mapcount, "page", "_mapcount");
+	OFFSET_INIT(page.private, "page", "private");
 
 	/*
 	 * On linux-2.6.16 or later, page.mapping is defined
@@ -974,6 +976,10 @@ get_structure_info(void)
 	ENUM_NUMBER_INIT(PG_lru, "PG_lru");
 	ENUM_NUMBER_INIT(PG_private, "PG_private");
 	ENUM_NUMBER_INIT(PG_swapcache, "PG_swapcache");
+	ENUM_NUMBER_INIT(PG_buddy, "PG_buddy");
+	ENUM_NUMBER_INIT(PG_slab, "PG_slab");
+
+	ENUM_TYPE_SIZE_INIT(pageflags, "pageflags");
 
 	TYPEDEF_SIZE_INIT(nodemask_t, "nodemask_t");
 
@@ -1324,6 +1330,7 @@ write_vmcoreinfo_data(void)
 	WRITE_STRUCTURE_SIZE("list_head", list_head);
 	WRITE_STRUCTURE_SIZE("node_memblk_s", node_memblk_s);
 	WRITE_STRUCTURE_SIZE("nodemask_t", nodemask_t);
+	WRITE_STRUCTURE_SIZE("pageflags", pageflags);
 
 	/*
 	 * write the member offset of 1st kernel
@@ -1332,6 +1339,8 @@ write_vmcoreinfo_data(void)
 	WRITE_MEMBER_OFFSET("page._count", page._count);
 	WRITE_MEMBER_OFFSET("page.mapping", page.mapping);
 	WRITE_MEMBER_OFFSET("page.lru", page.lru);
+	WRITE_MEMBER_OFFSET("page._mapcount", page._mapcount);
+	WRITE_MEMBER_OFFSET("page.private", page.private);
 	WRITE_MEMBER_OFFSET("mem_section.section_mem_map",
 	    mem_section.section_mem_map);
 	WRITE_MEMBER_OFFSET("pglist_data.node_zones", pglist_data.node_zones);
@@ -1376,6 +1385,10 @@ write_vmcoreinfo_data(void)
 	WRITE_NUMBER("PG_lru", PG_lru);
 	WRITE_NUMBER("PG_private", PG_private);
 	WRITE_NUMBER("PG_swapcache", PG_swapcache);
+	WRITE_NUMBER("PG_buddy", PG_buddy);
+	WRITE_NUMBER("PG_slab", PG_slab);
+
+	WRITE_NUMBER("PAGE_BUDDY_MAPCOUNT_VALUE", PAGE_BUDDY_MAPCOUNT_VALUE);
 
 	/*
 	 * write the source file of 1st kernel
@@ -1623,11 +1636,14 @@ read_vmcoreinfo(void)
 	READ_STRUCTURE_SIZE("list_head", list_head);
 	READ_STRUCTURE_SIZE("node_memblk_s", node_memblk_s);
 	READ_STRUCTURE_SIZE("nodemask_t", nodemask_t);
+	READ_STRUCTURE_SIZE("pageflags", pageflags);
 
 	READ_MEMBER_OFFSET("page.flags", page.flags);
 	READ_MEMBER_OFFSET("page._count", page._count);
 	READ_MEMBER_OFFSET("page.mapping", page.mapping);
 	READ_MEMBER_OFFSET("page.lru", page.lru);
+	READ_MEMBER_OFFSET("page._mapcount", page._mapcount);
+	READ_MEMBER_OFFSET("page.private", page.private);
 	READ_MEMBER_OFFSET("mem_section.section_mem_map",
 	    mem_section.section_mem_map);
 	READ_MEMBER_OFFSET("pglist_data.node_zones", pglist_data.node_zones);
@@ -1664,6 +1680,10 @@ read_vmcoreinfo(void)
 	READ_NUMBER("PG_lru", PG_lru);
 	READ_NUMBER("PG_private", PG_private);
 	READ_NUMBER("PG_swapcache", PG_swapcache);
+	READ_NUMBER("PG_slab", PG_slab);
+	READ_NUMBER("PG_buddy", PG_buddy);
+
+	READ_NUMBER("PAGE_BUDDY_MAPCOUNT_VALUE", PAGE_BUDDY_MAPCOUNT_VALUE);
 
 	READ_SRCFILE("pud_t", pud_t);
 



* [RFC PATCH 05/10] Add page flag values as hardcoded values
  2012-06-28 17:37 [RFC PATCH 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
                   ` (3 preceding siblings ...)
  2012-06-28 17:38 ` [RFC PATCH 04/10] Add debuginfo-related processing for VMCOREINFO/VMLINUX HATAYAMA Daisuke
@ 2012-06-28 17:38 ` HATAYAMA Daisuke
  2012-06-28 17:38 ` [RFC PATCH 06/10] Add command-line processing for free page filtering looking up mem_map array HATAYAMA Daisuke
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: HATAYAMA Daisuke @ 2012-06-28 17:38 UTC (permalink / raw)
  To: kumagai-atsushi; +Cc: kexec

These values should normally be exported from the kernel as VMCOREINFO,
but we cannot modify old kernels. Still, they are easy to determine
without symbol and type information because they depend only slightly
on the kernel version.

- PG_slab has had the same value, 7, since v2.6.15, and

- PG_buddy was defined as a macro value from v2.6.17 to v2.6.26.

  In later versions, the pageflags enumeration type was introduced and
  PG_buddy's value depends on CONFIG_PAGEFLAGS_EXTENDED; luckily, this
  can be determined by looking at the error_states array, for example,
  but I don't implement it in this patch.
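
The fallback rule described above can be sketched as follows. This is
only an illustration under stated assumptions, not the code of this
patch: the DEMO_* names, the sentinel value and the version encoding
are hypothetical, while the constants 7 and 19 and the version range
are the ones given in this message.

```c
#include <assert.h>

/* Hypothetical names; the encoding of the version is an assumption. */
#define DEMO_KERNEL_VERSION(x, y, z)	(((x) << 16) + ((y) << 8) + (z))
#define DEMO_NOT_FOUND			(-1L)	/* hypothetical sentinel */
#define DEMO_PG_SLAB			(7L)
#define DEMO_PG_BUDDY_OLD		(19L)

/* Keep a value found in debuginfo; otherwise fall back to 7. */
long
demo_pg_slab(long from_debuginfo)
{
	return from_debuginfo != DEMO_NOT_FOUND ? from_debuginfo
						: DEMO_PG_SLAB;
}

/*
 * Keep a value found in debuginfo; otherwise use 19, but only inside
 * the version range where PG_buddy was a plain macro.
 */
long
demo_pg_buddy(long from_debuginfo, long version)
{
	assert(version > 0);

	if (from_debuginfo != DEMO_NOT_FOUND)
		return from_debuginfo;
	if (version >= DEMO_KERNEL_VERSION(2, 6, 17)
	    && version <= DEMO_KERNEL_VERSION(2, 6, 26))
		return DEMO_PG_BUDDY_OLD;
	return DEMO_NOT_FOUND;
}
```

Outside the range, the sketch leaves PG_buddy unset rather than
guessing, which mirrors the decision not to hard-code the
configuration-dependent values here.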

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
---

 makedumpfile.c |    6 ++++++
 makedumpfile.h |    3 +++
 2 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index d8da608..bed74df 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -1179,6 +1179,12 @@ get_value_for_old_linux(void)
 		NUMBER(PG_private) = PG_private_ORIGINAL;
 	if (NUMBER(PG_swapcache) == NOT_FOUND_NUMBER)
 		NUMBER(PG_swapcache) = PG_swapcache_ORIGINAL;
+	if (NUMBER(PG_slab) == NOT_FOUND_NUMBER)
+		NUMBER(PG_slab) = PG_slab_ORIGINAL;
+	if (NUMBER(PG_buddy) == NOT_FOUND_NUMBER
+	    && info->kernel_version >= KERNEL_VERSION(2, 6, 17)
+	    && info->kernel_version <= KERNEL_VERSION(2, 6, 26))
+		NUMBER(PG_buddy) = PG_buddy_v2_6_17_to_v2_6_26;
 	return TRUE;
 }
 
diff --git a/makedumpfile.h b/makedumpfile.h
index 2808871..3059a9e 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -68,9 +68,12 @@ int get_mem_type(void);
  * The following values are for linux-2.6.25 or former.
  */
 #define PG_lru_ORIGINAL	 	(5)
+#define PG_slab_ORIGINAL	(7)
 #define PG_private_ORIGINAL	(11)	/* Has something at ->private */
 #define PG_swapcache_ORIGINAL	(15)	/* Swap page: swp_entry_t in private */
 
+#define PG_buddy_v2_6_17_to_v2_6_26	(19)
+
 #define PAGE_MAPPING_ANON	(1)
 
 #define LSEEKED_BITMAP	(1)



* [RFC PATCH 06/10] Add command-line processing for free page filtering looking up mem_map array
  2012-06-28 17:37 [RFC PATCH 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
                   ` (4 preceding siblings ...)
  2012-06-28 17:38 ` [RFC PATCH 05/10] Add page flag values as hardcoded values HATAYAMA Daisuke
@ 2012-06-28 17:38 ` HATAYAMA Daisuke
  2012-06-28 17:38 ` [RFC PATCH 07/10] Add exclude free pages by " HATAYAMA Daisuke
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: HATAYAMA Daisuke @ 2012-06-28 17:38 UTC (permalink / raw)
  To: kumagai-atsushi; +Cc: kexec

I chose dump level 32 for free page filtering that looks up the mem_map
array. I don't mean this to be the final interface; it is for
experimental RFC use only. I would prefer that the mem_map-based logic
be used automatically when dump level 16 and the --cyclic option are
specified together.
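
For reference, the dump level is a bit mask, so adding bit 0x020 raises
the maximum level from 31 to 63. A minimal sketch of the resulting
check follows; the DEMO_* names are hypothetical, and the values of the
two cache bits are assumed since they do not appear in this diff.

```c
#include <assert.h>

#define DEMO_DL_EXCLUDE_ZERO		(0x001)
#define DEMO_DL_EXCLUDE_CACHE		(0x002)	/* assumed value */
#define DEMO_DL_EXCLUDE_CACHE_PRI	(0x004)	/* assumed value */
#define DEMO_DL_EXCLUDE_USER_DATA	(0x008)
#define DEMO_DL_EXCLUDE_FREE		(0x010)
#define DEMO_DL_EXCLUDE_FREE_CONST	(0x020)
#define DEMO_MAX_DUMP_LEVEL		(63)

/*
 * Return 1 if the given dump level requires the pass over the mem_map
 * array, 0 otherwise. Zero-page and free-list filtering are handled
 * elsewhere, so their bits alone do not trigger the pass.
 */
int
demo_needs_mem_map_pass(int dump_level)
{
	assert(dump_level >= 0 && dump_level <= DEMO_MAX_DUMP_LEVEL);

	return (dump_level & (DEMO_DL_EXCLUDE_CACHE
			      | DEMO_DL_EXCLUDE_CACHE_PRI
			      | DEMO_DL_EXCLUDE_USER_DATA
			      | DEMO_DL_EXCLUDE_FREE_CONST)) != 0;
}
```

With these values, level 32 alone triggers the pass, while level 16
(the existing free list logic) does not.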

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
---

 makedumpfile.c |    3 ++-
 makedumpfile.h |    4 +++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index bed74df..d1eded0 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -3853,7 +3853,8 @@ create_2nd_bitmap(void)
 	 */
 	if (info->dump_level & DL_EXCLUDE_CACHE ||
 	    info->dump_level & DL_EXCLUDE_CACHE_PRI ||
-	    info->dump_level & DL_EXCLUDE_USER_DATA) {
+	    info->dump_level & DL_EXCLUDE_USER_DATA ||
+	    info->dump_level & DL_EXCLUDE_FREE_CONST) {
 		if (!exclude_unnecessary_pages()) {
 			ERRMSG("Can't exclude unnecessary pages.\n");
 			return FALSE;
diff --git a/makedumpfile.h b/makedumpfile.h
index 3059a9e..404f00e 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -146,7 +146,7 @@ isAnon(unsigned long mapping)
  * Dump Level
  */
 #define MIN_DUMP_LEVEL		(0)
-#define MAX_DUMP_LEVEL		(31)
+#define MAX_DUMP_LEVEL		(63)
 #define NUM_ARRAY_DUMP_LEVEL	(MAX_DUMP_LEVEL + 1) /* enough to allocate
 							all the dump_level */
 #define DL_EXCLUDE_ZERO		(0x001) /* Exclude Pages filled with Zeros */
@@ -156,6 +156,8 @@ isAnon(unsigned long mapping)
 				           with Private Pages */
 #define DL_EXCLUDE_USER_DATA	(0x008) /* Exclude UserProcessData Pages */
 #define DL_EXCLUDE_FREE		(0x010)	/* Exclude Free Pages */
+#define DL_EXCLUDE_FREE_CONST	(0x020)	/* Exclude Free Pages looking
+					   up mem_map */
 
 
 /*



* [RFC PATCH 07/10] Add exclude free pages by looking up mem_map array
  2012-06-28 17:37 [RFC PATCH 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
                   ` (5 preceding siblings ...)
  2012-06-28 17:38 ` [RFC PATCH 06/10] Add command-line processing for free page filtering looking up mem_map array HATAYAMA Daisuke
@ 2012-06-28 17:38 ` HATAYAMA Daisuke
  2012-06-28 17:39 ` [RFC PATCH 08/10] Add page_is_buddy for recent kernels HATAYAMA Daisuke
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: HATAYAMA Daisuke @ 2012-06-28 17:38 UTC (permalink / raw)
  To: kumagai-atsushi; +Cc: kexec

Add free page filtering logic to the processing that looks up the
mem_map array. A page_is_buddy handler is newly introduced; it
abstracts the condition for identifying a buddy page, which varies with
the kernel version. Across the range of kernel versions that
makedumpfile supports, there are three kinds of page_is_buddy, which
are introduced in later patches. Also, the _mapcount and private fields
of struct page are now read, since the page_is_buddy handlers use them.

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
---

 makedumpfile.c |   21 ++++++++++++++++++---
 makedumpfile.h |    6 ++++++
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index d1eded0..37c371e 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -3689,8 +3689,8 @@ __exclude_unnecessary_pages(unsigned long mem_map,
 	unsigned long long pfn_read_start, pfn_read_end, index_pg;
 	unsigned char page_cache[SIZE(page) * PGMM_CACHED];
 	unsigned char *pcache;
-	unsigned int _count;
-	unsigned long flags, mapping;
+	unsigned int _count, _mapcount;
+	unsigned long flags, mapping, private;
 
 	/*
 	 * Refresh the buffer of struct page, when changing mem_map.
@@ -3738,11 +3738,26 @@ __exclude_unnecessary_pages(unsigned long mem_map,
 		flags   = ULONG(pcache + OFFSET(page.flags));
 		_count  = UINT(pcache + OFFSET(page._count));
 		mapping = ULONG(pcache + OFFSET(page.mapping));
+		_mapcount = UINT(pcache + OFFSET(page._mapcount));
+		private = ULONG(pcache + OFFSET(page.private));
 
+ 		/*
+		 * Exclude the free page managed by a buddy.
+		 */
+		if ((info->dump_level & DL_EXCLUDE_FREE_CONST)
+		    && info->page_is_buddy
+		    && info->page_is_buddy(flags, _mapcount, private,
+					   _count)) {
+			int i;
+
+			for (i = 0; i < (1<<private); ++i)
+				clear_bit_on_2nd_bitmap_for_kernel(pfn + i);
+			pfn_free += i;
+		}
 		/*
 		 * Exclude the cache page without the private page.
 		 */
-		if ((info->dump_level & DL_EXCLUDE_CACHE)
+		else if ((info->dump_level & DL_EXCLUDE_CACHE)
 		    && (isLRU(flags) || isSwapCache(flags))
 		    && !isPrivate(flags) && !isAnon(mapping)) {
 			clear_bit_on_2nd_bitmap_for_kernel(pfn);
diff --git a/makedumpfile.h b/makedumpfile.h
index 404f00e..d867dfe 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -943,6 +943,12 @@ struct DumpInfo {
 	 */
 	int flag_sadump_diskset;
 	enum sadump_format_type flag_sadump;         /* sadump format type */
+
+	/*
+	 * for filtering free pages managed by buddy system:
+	 */
+	int (*page_is_buddy)(unsigned long flags, unsigned int _mapcount,
+			     unsigned long private, unsigned int _count);
 };
 extern struct DumpInfo		*info;
 



* [RFC PATCH 08/10] Add page_is_buddy for recent kernels
  2012-06-28 17:37 [RFC PATCH 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
                   ` (6 preceding siblings ...)
  2012-06-28 17:38 ` [RFC PATCH 07/10] Add exclude free pages by " HATAYAMA Daisuke
@ 2012-06-28 17:39 ` HATAYAMA Daisuke
  2012-06-28 17:39 ` [RFC PATCH 09/10] Add page_is_buddy for PG_buddy HATAYAMA Daisuke
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: HATAYAMA Daisuke @ 2012-06-28 17:39 UTC (permalink / raw)
  To: kumagai-atsushi; +Cc: kexec

On v2.6.38 and later kernels, a buddy page is marked by
_mapcount == PAGE_BUDDY_MAPCOUNT_VALUE, whose value changed once:

  kernel version    | PAGE_BUDDY_MAPCOUNT_VALUE
  ------------------+--------------------------
  v2.6.38           | -2
  v2.6.39 and later | -128

Note also that _mapcount shares its memory with other fields used by
SLAB/SLUB when PG_slab is set. We need to check whether PG_slab is set
before looking at the _mapcount value.
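
The order of the two checks can be sketched as follows. The function
name and the DEMO_PG_SLAB constant are hypothetical (7 is the PG_slab
value quoted elsewhere in this thread), and the raw _mapcount is
assumed to have been read as an unsigned 32-bit value.

```c
#include <assert.h>

#define DEMO_PG_SLAB	(7)	/* PG_slab bit number quoted in this thread */

/*
 * Return 1 if the page looks like the head of a free buddy block.
 * BUDDY_VALUE is PAGE_BUDDY_MAPCOUNT_VALUE for the dumped kernel,
 * i.e. -2 on v2.6.38 and -128 on v2.6.39 and later.
 */
int
demo_page_is_buddy(unsigned long flags, unsigned int mapcount_raw,
		   int buddy_value)
{
	assert(buddy_value < 0);

	/* For slab pages the same bytes hold other data; ignore them. */
	if (flags & (1UL << DEMO_PG_SLAB))
		return 0;

	/* Compare as unsigned so that no signed conversion is needed. */
	return mapcount_raw == (unsigned int)buddy_value;
}
```

A slab page whose overlapping field happens to equal the marker value
is therefore never mistaken for a free page.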

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
---

 makedumpfile.c |   30 ++++++++++++++++++++++++++++++
 1 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index 37c371e..aea956f 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -55,6 +55,8 @@ do { \
 		*ptr_long_table = value; \
 } while (0)
 
+static void setup_page_is_buddy(void);
+
 void
 initialize_tables(void)
 {
@@ -2733,6 +2735,9 @@ out:
 	if (!get_value_for_old_linux())
 		return FALSE;
 
+	if (info->dump_level & DL_EXCLUDE_FREE_CONST)
+		setup_page_is_buddy();
+
 	return TRUE;
 }
 
@@ -3512,6 +3517,31 @@ exclude_free_page(void)
 	return TRUE;
 }
 
+static int
+page_is_buddy_v3(unsigned long flags, unsigned int _mapcount,
+		 unsigned long private, unsigned int _count)
+{
+	if (flags & (1UL << NUMBER(PG_slab)))
+		return FALSE;
+
+	if (_mapcount == (int)NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE))
+		return TRUE;
+
+	return FALSE;
+}
+
+static void
+setup_page_is_buddy(void)
+{
+	if (NUMBER(PG_buddy) == NOT_FOUND_NUMBER
+	    && SIZE(pageflags) != NOT_FOUND_STRUCTURE
+	    && NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) != NOT_FOUND_NUMBER)
+		info->page_is_buddy = page_is_buddy_v3;
+
+	MSG("Can't select page_is_buddy handler; "
+	    "filtering free pages is disabled.\n");
+}
+
 /*
  * If using a dumpfile in kdump-compressed format as a source file
  * instead of /proc/vmcore, 1st-bitmap of a new dumpfile must be



* [RFC PATCH 09/10] Add page_is_buddy for PG_buddy
  2012-06-28 17:37 [RFC PATCH 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
                   ` (7 preceding siblings ...)
  2012-06-28 17:39 ` [RFC PATCH 08/10] Add page_is_buddy for recent kernels HATAYAMA Daisuke
@ 2012-06-28 17:39 ` HATAYAMA Daisuke
  2012-06-28 17:39 ` [RFC PATCH 10/10] Add page_is_buddy for old kernels HATAYAMA Daisuke
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 14+ messages in thread
From: HATAYAMA Daisuke @ 2012-06-28 17:39 UTC (permalink / raw)
  To: kumagai-atsushi; +Cc: kexec

On kernels from v2.6.18 to v2.6.37, buddy page is marked by the
PG_buddy flag.

  kernel version     | PG_buddy
  -------------------+-----------------------------------
  v2.6.17 to v2.6.26 | 19
  v2.6.27 to v2.6.37 | 19 if CONFIG_PAGEFLAGS_EXTENDED=y
                     | 18 otherwise

Note for hard coding: it's possible to determine whether
CONFIG_PAGEFLAGS_EXTENDED=y or not by looking at the error_states array
defined in mm/memory-failure.c, for example.
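
For illustration, the table above and the flag test can be written as
the sketch below. This patch itself takes the bit number from
NUMBER(PG_buddy); the function names, the version encoding and the -1
return value here are hypothetical.

```c
#include <assert.h>

/* Hypothetical name; the encoding of the version is an assumption. */
#define DEMO_KERNEL_VERSION(x, y, z)	(((x) << 16) + ((y) << 8) + (z))

/*
 * Return the PG_buddy bit number according to the table above, or -1
 * for versions that do not mark buddy pages with PG_buddy.
 */
int
demo_pg_buddy_bit(long version, int pageflags_extended)
{
	if (version < DEMO_KERNEL_VERSION(2, 6, 17)
	    || version > DEMO_KERNEL_VERSION(2, 6, 37))
		return -1;
	if (version <= DEMO_KERNEL_VERSION(2, 6, 26))
		return 19;
	return pageflags_extended ? 19 : 18;
}

/* Return 1 if the given bit is set in the page flags. */
int
demo_page_is_buddy_flag(unsigned long flags, int bit)
{
	assert(bit >= 0 && bit < 32);

	return (flags & (1UL << bit)) != 0;
}
```

Unlike the _mapcount-based test, this one needs no PG_slab check,
because the flag bit is not shared with other data.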

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
---

 makedumpfile.c |   27 ++++++++++++++++++++-------
 1 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index aea956f..675b47e 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -3518,6 +3518,16 @@ exclude_free_page(void)
 }
 
 static int
+page_is_buddy_v2(unsigned long flags, unsigned int _mapcount,
+		 unsigned long private, unsigned int _count)
+{
+	if (flags & (1UL << NUMBER(PG_buddy)))
+		return TRUE;
+
+	return FALSE;
+}
+
+static int
 page_is_buddy_v3(unsigned long flags, unsigned int _mapcount,
 		 unsigned long private, unsigned int _count)
 {
@@ -3533,13 +3543,16 @@ page_is_buddy_v3(unsigned long flags, unsigned int _mapcount,
 static void
 setup_page_is_buddy(void)
 {
-	if (NUMBER(PG_buddy) == NOT_FOUND_NUMBER
-	    && SIZE(pageflags) != NOT_FOUND_STRUCTURE
-	    && NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) != NOT_FOUND_NUMBER)
-		info->page_is_buddy = page_is_buddy_v3;
-
-	MSG("Can't select page_is_buddy handler; "
-	    "filtering free pages is disabled.\n");
+	if (NUMBER(PG_buddy) == NOT_FOUND_NUMBER) {
+		if (SIZE(pageflags) != NOT_FOUND_STRUCTURE
+		    && NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) != NOT_FOUND_NUMBER)
+			info->page_is_buddy = page_is_buddy_v3;
+		else {
+			MSG("Can't select page_is_buddy handler; "
+			    "filtering free pages is disabled.\n");
+		}
+	} else
+		info->page_is_buddy = page_is_buddy_v2;
 }
 
 /*



* [RFC PATCH 10/10] Add page_is_buddy for old kernels
  2012-06-28 17:37 [RFC PATCH 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
                   ` (8 preceding siblings ...)
  2012-06-28 17:39 ` [RFC PATCH 09/10] Add page_is_buddy for PG_buddy HATAYAMA Daisuke
@ 2012-06-28 17:39 ` HATAYAMA Daisuke
  2012-06-29  3:07 ` [RFC PATCH 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
  2012-06-29  6:23 ` Atsushi Kumagai
  11 siblings, 0 replies; 14+ messages in thread
From: HATAYAMA Daisuke @ 2012-06-28 17:39 UTC (permalink / raw)
  To: kumagai-atsushi; +Cc: kexec

On kernels from v2.6.15 to v2.6.17, a buddy page is marked by the
PG_private flag being set together with _count == 0.

Unfortunately, I have yet to test this logic on these kernel versions
simply because I've been failing to boot them on my box.

Note that on these kernels, the free list can be corrupted due to a
bug: the above two conditions are not checked atomically. PG_buddy was
introduced as a fix for this bug. Thus, the bug can also affect the
logic based on the mem_map array, and we cannot avoid that.
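
With this patch applied, the choice among the three handlers depends
only on which values were found. The sketch below restates that
decision as a function returning an enumeration; the names and the
sentinel are hypothetical, and the real code stores a function pointer
in info->page_is_buddy instead.

```c
#include <assert.h>

#define DEMO_NOT_FOUND	(-1L)	/* hypothetical "value missing" sentinel */

enum demo_handler {
	DEMO_HANDLER_NONE,	/* free page filtering is disabled */
	DEMO_HANDLER_V1,	/* PG_private set and _count == 0 */
	DEMO_HANDLER_V2,	/* PG_buddy flag */
	DEMO_HANDLER_V3,	/* _mapcount == PAGE_BUDDY_MAPCOUNT_VALUE */
};

enum demo_handler
demo_select_handler(long pg_buddy, long pageflags_size,
		    long buddy_mapcount_value)
{
	if (pg_buddy != DEMO_NOT_FOUND)
		return DEMO_HANDLER_V2;
	if (pageflags_size == DEMO_NOT_FOUND)
		return DEMO_HANDLER_V1;
	if (buddy_mapcount_value != DEMO_NOT_FOUND)
		return DEMO_HANDLER_V3;
	return DEMO_HANDLER_NONE;
}

/* The oldest condition: PG_private is set and _count is zero. */
int
demo_page_is_buddy_v1(unsigned long flags, unsigned int count,
		      int pg_private_bit)
{
	assert(pg_private_bit >= 0 && pg_private_bit < 32);

	return (flags & (1UL << pg_private_bit)) && count == 0;
}
```

The absence of the pageflags enumeration type is what identifies the
oldest kernels, which is why SIZE(pageflags) is needed at all.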

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
---

 makedumpfile.c |   17 +++++++++++++++--
 1 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index 675b47e..b73cc64 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -3518,6 +3518,18 @@ exclude_free_page(void)
 }
 
 static int
+page_is_buddy_v1(unsigned long flags, unsigned int _mapcount,
+		 unsigned long private, unsigned int _count)
+{
+	if ((flags & (1UL << NUMBER(PG_private)))
+	    && _count == 0
+	    && private <= ARRAY_LENGTH(zone.free_area))
+		return TRUE;
+
+	return FALSE;
+}
+
+static int
 page_is_buddy_v2(unsigned long flags, unsigned int _mapcount,
 		 unsigned long private, unsigned int _count)
 {
@@ -3544,8 +3556,9 @@ static void
 setup_page_is_buddy(void)
 {
 	if (NUMBER(PG_buddy) == NOT_FOUND_NUMBER) {
-		if (SIZE(pageflags) != NOT_FOUND_STRUCTURE
-		    && NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) != NOT_FOUND_NUMBER)
+		if (SIZE(pageflags) == NOT_FOUND_STRUCTURE)
+			info->page_is_buddy = page_is_buddy_v1;
+		else if (NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) != NOT_FOUND_NUMBER)
 			info->page_is_buddy = page_is_buddy_v3;
 		else {
 			MSG("Can't select page_is_buddy handler; "



* Re: [RFC PATCH 00/10] Support free page filtering looking up mem_map array
  2012-06-28 17:37 [RFC PATCH 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
                   ` (9 preceding siblings ...)
  2012-06-28 17:39 ` [RFC PATCH 10/10] Add page_is_buddy for old kernels HATAYAMA Daisuke
@ 2012-06-29  3:07 ` HATAYAMA Daisuke
  2012-06-29  6:23 ` Atsushi Kumagai
  11 siblings, 0 replies; 14+ messages in thread
From: HATAYAMA Daisuke @ 2012-06-29  3:07 UTC (permalink / raw)
  To: kumagai-atsushi; +Cc: kexec

From: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Subject: [RFC PATCH 00/10] Support free page filtering looking up mem_map array
Date: Fri, 29 Jun 2012 02:37:57 +0900

> This new free page filtering needs the following values.
> 
>   - OFFSET(page._mapcount)
>   - OFFSET(page.private)
>   - SIZE(pageflags)
>   - NUMBER(PG_buddy)
>   - NUMBER(PG_slab)
>   - NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)
> 
> Unfortunately, OFFSET(_mapcount) and OFFSET(private) fields of page
> structure cannot be obtained from VMLINUX using the exiting library in
> makedumpfile since two members are anonymous components of union
> types. We need a new interface for them.
> 
> To try to use this patch set, it's handy to pass manually editted
> VMCOREINFO file via -i option.
> 

Concretely, I added the following values to a VMCOREINFO file that was
either generated with the -g option or extracted by running the strings
command directly on the vmcore.

 * 3.1.0-7.fc16.x86_64

NUMBER(PG_slab)=7
SIZE(pageflags)=4
OFFSET(page._mapcount)=24
OFFSET(page.private)=48
NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)=-128

 * 2.6.38

SIZE(pageflags)=4
OFFSET(page._mapcount)=12
OFFSET(page.private)=16
NUMBER(PG_slab)=7
NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)=-2

 * 2.6.32
NUMBER(PG_slab)=7
NUMBER(PG_buddy)=19
OFFSET(page._mapcount)=12
OFFSET(page.private)=16
SIZE(pageflags)=4

 * 2.6.18
NUMBER(PG_slab)=7
NUMBER(PG_buddy)=19
OFFSET(page._mapcount)=12
OFFSET(page.private)=16

Thanks.
HATAYAMA, Daisuke



* Re: [RFC PATCH 00/10] Support free page filtering looking up mem_map array
  2012-06-28 17:37 [RFC PATCH 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
                   ` (10 preceding siblings ...)
  2012-06-29  3:07 ` [RFC PATCH 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
@ 2012-06-29  6:23 ` Atsushi Kumagai
  2012-07-13  5:23   ` Atsushi Kumagai
  11 siblings, 1 reply; 14+ messages in thread
From: Atsushi Kumagai @ 2012-06-29  6:23 UTC (permalink / raw)
  To: d.hatayama; +Cc: kexec

Hello HATAYAMA-san,

On Fri, 29 Jun 2012 02:37:57 +0900
HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> wrote:

> Sorry for late posting. I made RFC patch set for free page filtering
> looking up mem_map array. Unlike exiting method looking up free page
> list, this is done in constant space.
> 
> I intend this patch set to be merged with Kumagai-san's cyclic patch
> set, so I mark these with RFC. See TODO below. Also, I have yet to
> test the logic for old kernels from v2.6.15 to v2.6.17.
> 
> This new free page filtering needs the following values.
> 
>   - OFFSET(page._mapcount)
>   - OFFSET(page.private)
>   - SIZE(pageflags)
>   - NUMBER(PG_buddy)
>   - NUMBER(PG_slab)
>   - NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)
> 
> Unfortunately, OFFSET(_mapcount) and OFFSET(private) fields of page
> structure cannot be obtained from VMLINUX using the exiting library in
> makedumpfile since two members are anonymous components of union
> types. We need a new interface for them.
> 
> To try to use this patch set, it's handy to pass manually editted
> VMCOREINFO file via -i option.
> 
> TODO
> 
>   1. Add new values in VMCOREINFO on the upstream kernel.
> 
>   2. Decide when to use this logic instead of the existing free list
>   logic. Option is 1) introduce new dump level or 2) use it
>   automatically if --cyclic is specified. This patch chooses 1) only
>   for RFC use.
> 
>   3. Consider how to deal with old kernels on which we cannot add the
>   values in VMCOREINFO. Options is 1) to force users to use VMLINUX,
>   2) to cover them full hard coding or 3) to give up support on full
>   range of kernel versions ever.

Thank you, as always, for your work.

I will review your patches and measure the execution time together with
the v2 patches for cyclic processing. If your patches prove effective,
I will then consider the TODO items above.
Please wait a while.


Thanks
Atsushi Kumagai

> ---
> 
> HATAYAMA Daisuke (10):
>       Add page_is_buddy for old kernels
>       Add page_is_buddy for PG_buddy
>       Add page_is_buddy for recent kernels
>       Add excldue free pages by looking up mem_map array
>       Add command-line processing for free page filtering looking up mem_map array
>       Add page flag values as hardcoded values
>       Add debuginfo-related processing for VMCOREINFO/VMLINUX
>       Add new parameters for various tables
>       Add debuginfo interface for enum type size
>       Move page flags setup for old kernels after debuginfo initialization
> 
> 
>  dwarf_info.c   |   29 +++++++++++----
>  dwarf_info.h   |    1 +
>  makedumpfile.c |  111 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
>  makedumpfile.h |   34 ++++++++++++++++-
>  4 files changed, 158 insertions(+), 17 deletions(-)
> 
> -- 
> 
> Thanks.
> HATAYAMA, Daisuke


* Re: [RFC PATCH 00/10] Support free page filtering looking up mem_map array
  2012-06-29  6:23 ` Atsushi Kumagai
@ 2012-07-13  5:23   ` Atsushi Kumagai
  0 siblings, 0 replies; 14+ messages in thread
From: Atsushi Kumagai @ 2012-07-13  5:23 UTC (permalink / raw)
  To: kexec, d.hatayama

Hello,

On Fri, 29 Jun 2012 15:23:37 +0900
Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp> wrote:

> Hello HATAYAMA-san,
> 
> On Fri, 29 Jun 2012 02:37:57 +0900
> HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> wrote:
> 
> > Sorry for late posting. I made RFC patch set for free page filtering
> > looking up mem_map array. Unlike exiting method looking up free page
> > list, this is done in constant space.
> > 
> > I intend this patch set to be merged with Kumagai-san's cyclic patch
> > set, so I mark these with RFC. See TODO below. Also, I have yet to
> > test the logic for old kernels from v2.6.15 to v2.6.17.
> > 
> > This new free page filtering needs the following values.
> > 
> >   - OFFSET(page._mapcount)
> >   - OFFSET(page.private)
> >   - SIZE(pageflags)
> >   - NUMBER(PG_buddy)
> >   - NUMBER(PG_slab)
> >   - NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)
> > 
> > Unfortunately, OFFSET(_mapcount) and OFFSET(private) fields of page
> > structure cannot be obtained from VMLINUX using the exiting library in
> > makedumpfile since two members are anonymous components of union
> > types. We need a new interface for them.
> > 
> > To try to use this patch set, it's handy to pass manually editted
> > VMCOREINFO file via -i option.
> > 
> > TODO
> > 
> >   1. Add new values in VMCOREINFO on the upstream kernel.
> > 
> >   2. Decide when to use this logic instead of the existing free list
> >   logic. Option is 1) introduce new dump level or 2) use it
> >   automatically if --cyclic is specified. This patch chooses 1) only
> >   for RFC use.
> > 
> >   3. Consider how to deal with old kernels on which we cannot add the
> >   values in VMCOREINFO. Options is 1) to force users to use VMLINUX,
> >   2) to cover them full hard coding or 3) to give up support on full
> >   range of kernel versions ever.
> 
> Thank you always for your work.
> 
> I will review your patches and measure executing time with v2 patch
> of cyclic processing. If your patches are effective, then I will 
> consider TODO above.
> Please wait for a while.

I measured performance with the v2 patches for cyclic processing combined with HATAYAMA-san's patches.
I also fixed the v2 patches to avoid wasted work; please see the patch at the end of this mail.


How to measure:

  - The source data is a vmcore saved on disk; its size is 5,099,292,912 bytes.
  - makedumpfile writes the dumpfile to the same disk as the source data.
  - I measured the execution time with time(1) and took the average of 5 runs.

Test Cases:

  - _mapcount:
    Implemented by HATAYAMA-san.
    It examines members of the page structure, instead of walking free_list,
    to filter out free pages.

  - free_list:
    The v2 patches use this logic.
    It walks the whole free_list in every cycle to filter out free pages.

  - upstream (v1.4.4):
    This logic is NOT cyclic; it uses a temporary bitmap file as usual.

Example:

  - _mapcount:
    $ time makedumpfile --cyclic -d32  -i vmcoreinfo vmcore dumpfile.d32

  - free_list:
    $ time makedumpfile --cyclic -d16  -i vmcoreinfo vmcore dumpfile.d16

  - upstream:
    $ time makedumpfile -d16  -i vmcoreinfo vmcore dumpfile.d16

Result:

  a) exclude only free pages

     BUFSIZE_CYCLIC  |                |                  execution time [sec]
         [byte]      |  num of cycle  | _mapcount (-d32) | free_list (-d16) | upstream (-d16)
   ------------------+----------------+------------------+------------------+-----------------
       1024          |      152       |      20.5204     |      28.8028     |        -
       1024 * 10     |       16       |      14.7460     |      18.7904     |        -
       1024 * 100    |        2       |      14.3962     |      17.9356     |        -
       1024 * 200    |        1       |      14.3166     |      17.8762     |     17.7928

  b) exclude all unnecessary pages

     BUFSIZE_CYCLIC  |                |                  execution time [sec]
         [byte]      |  num of cycle  | _mapcount (-d47) | free_list (-d31) | upstream (-d31)
   ------------------+----------------+------------------+------------------+-----------------
       1024          |      152       |      11.5086     |      27.2906     |        -
       1024 * 10     |       16       |       6.0740     |      10.9998     |        -
       1024 * 100    |        2       |       5.7928     |       9.1534     |        -
       1024 * 200    |        1       |       5.6378     |       8.9924     |      5.0516
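
The "num of cycle" column can be reproduced with simple arithmetic. The
sketch below rests on assumptions that are not stated in this mail: 4 KiB
pages, one bitmap bit per page frame in a BUFSIZE_CYCLIC-byte buffer, and
the vmcore file size taken as an approximation of the memory covered.
div_round_up() and num_cycles() are hypothetical helper names.

```c
/* Round-up integer division. */
static unsigned long long
div_round_up(unsigned long long n, unsigned long long d)
{
	return (n + d - 1) / d;
}

/*
 * Estimate the number of cycles needed to cover a vmcore.
 * Assumptions: 4096-byte pages and one bitmap bit per page frame.
 */
static unsigned long long
num_cycles(unsigned long long vmcore_bytes, unsigned long long bufsize_cyclic)
{
	unsigned long long pages = div_round_up(vmcore_bytes, 4096ULL);
	unsigned long long pages_per_cycle = bufsize_cyclic * 8ULL;

	return div_round_up(pages, pages_per_cycle);
}
```

Under these assumptions, num_cycles(5099292912ULL, 1024) gives 152, and
buffer sizes of 1024 * 10, 1024 * 100 and 1024 * 200 give 16, 2 and 1,
matching the tables. That agreement is suggestive only; it does not
confirm how the patches compute the cycle range.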


I expected the difference in execution time to grow with the number of cycles,
because I think rescanning free_list in every cycle is expensive.
The results seem to confirm this.

The _mapcount logic can be expected to perform well in almost all cases when --cyclic
is specified, so I think it is worth our effort to resolve the TODO items.

However, more consideration is needed to decide whether to use the _mapcount logic
when --cyclic isn't specified: in case b), upstream without --cyclic (5.0516 sec)
was faster than _mapcount with a single cycle (5.6378 sec).


> TODO
> 
>   1. Add new values in VMCOREINFO on the upstream kernel.

I will send this request to the upstream kernel.

>   2. Decide when to use this logic instead of the existing free list
>   logic. Option is 1) introduce new dump level or 2) use it
>   automatically if --cyclic is specified. This patch chooses 1) only
>   for RFC use.

I think 2) is reasonable, given the results.

>   3. Consider how to deal with old kernels on which we cannot add the
>   values in VMCOREINFO. Options is 1) to force users to use VMLINUX,
>   2) to cover them full hard coding or 3) to give up support on full
>   range of kernel versions ever.

I don't want to choose 2); I think it's inefficient.
Instead, I want to require users to prepare a VMCOREINFO file from vmlinux
with the -g option if cyclic processing is needed.


Do you have any comments?


Thanks
Atsushi Kumagai

diff --git a/makedumpfile.c b/makedumpfile.c
index 0e4660f..981d72a 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -3824,6 +3824,9 @@ __exclude_unnecessary_pages(unsigned long mem_map,
 
 	for (pfn = pfn_start; pfn < pfn_end; pfn++, mem_map += SIZE(page)) {
 
+		if (info->flag_cyclic && !is_cyclic_region(pfn))
+			continue;
+
 		/*
 		 * Exclude the memory hole.
 		 */
@@ -3960,17 +3963,25 @@ exclude_unnecessary_pages_cyclic(void)
 		if (!exclude_free_page())
 			return FALSE;
 
-	for (mm = 0; mm < info->num_mem_map; mm++) {
+	/*
+	 * Exclude cache pages, cache private pages, user data pages, and free pages.
+	 */
+	if (info->dump_level & DL_EXCLUDE_CACHE ||
+	    info->dump_level & DL_EXCLUDE_CACHE_PRI ||
+	    info->dump_level & DL_EXCLUDE_USER_DATA ||
+	    info->dump_level & DL_EXCLUDE_FREE_CONST) {
+		for (mm = 0; mm < info->num_mem_map; mm++) {
 
-		mmd = &info->mem_map_data[mm];
+			mmd = &info->mem_map_data[mm];
 
-		if (mmd->mem_map == NOT_MEMMAP_ADDR)
-			continue;
+			if (mmd->mem_map == NOT_MEMMAP_ADDR)
+				continue;
 
-		if (mmd->pfn_end >= info->cyclic_start_pfn || mmd->pfn_start <= info->cyclic_end_pfn) {
-			if (!__exclude_unnecessary_pages(mmd->mem_map,
-							 mmd->pfn_start, mmd->pfn_end))
-				return FALSE;
+			if (mmd->pfn_end >= info->cyclic_start_pfn && mmd->pfn_start <= info->cyclic_end_pfn) {
+				if (!__exclude_unnecessary_pages(mmd->mem_map,
+								 mmd->pfn_start, mmd->pfn_end))
+					return FALSE;
+			}
 		}
 	}
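
The two range checks touched by this patch (is a pfn inside the current
cycle, and does a mem_map region intersect the current cycle) can be
sketched on their own. This is a minimal sketch under the assumption that
ranges are half-open, [start, end); struct pfn_range, pfn_in_range() and
ranges_overlap() are hypothetical names and not makedumpfile functions.
Note that a standard interval-overlap test is a conjunction of two
comparisons; with a disjunction it would accept every region.

```c
/* Hypothetical half-open PFN range: start is included, end is excluded. */
struct pfn_range {
	unsigned long long start;
	unsigned long long end;
};

/* Returns 1 if pfn lies inside r, otherwise 0. */
static int
pfn_in_range(const struct pfn_range *r, unsigned long long pfn)
{
	return r->start <= pfn && pfn < r->end;
}

/* Returns 1 if a and b share at least one pfn, otherwise 0. */
static int
ranges_overlap(const struct pfn_range *a, const struct pfn_range *b)
{
	return a->start < b->end && b->start < a->end;
}
```

A region that fails ranges_overlap() against the current cycle can be
skipped entirely, and pfn_in_range() then filters the individual pages of
the regions that remain.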

Thread overview: 14+ messages
2012-06-28 17:37 [RFC PATCH 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
2012-06-28 17:38 ` [RFC PATCH 01/10] Move page flags setup for old kernels after debuginfo initialization HATAYAMA Daisuke
2012-06-28 17:38 ` [RFC PATCH 02/10] Add debuginfo interface for enum type size HATAYAMA Daisuke
2012-06-28 17:38 ` [RFC PATCH 03/10] Add new parameters for various tables HATAYAMA Daisuke
2012-06-28 17:38 ` [RFC PATCH 04/10] Add debuginfo-related processing for VMCOREINFO/VMLINUX HATAYAMA Daisuke
2012-06-28 17:38 ` [RFC PATCH 05/10] Add page flag values as hardcoded values HATAYAMA Daisuke
2012-06-28 17:38 ` [RFC PATCH 06/10] Add command-line processing for free page filtering looking up mem_map array HATAYAMA Daisuke
2012-06-28 17:38 ` [RFC PATCH 07/10] Add excldue free pages by " HATAYAMA Daisuke
2012-06-28 17:39 ` [RFC PATCH 08/10] Add page_is_buddy for recent kernels HATAYAMA Daisuke
2012-06-28 17:39 ` [RFC PATCH 09/10] Add page_is_buddy for PG_buddy HATAYAMA Daisuke
2012-06-28 17:39 ` [RFC PATCH 10/10] Add page_is_buddy for old kernels HATAYAMA Daisuke
2012-06-29  3:07 ` [RFC PATCH 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
2012-06-29  6:23 ` Atsushi Kumagai
2012-07-13  5:23   ` Atsushi Kumagai
