* [PATCH v2 00/10] Support free page filtering looking up mem_map array
@ 2012-11-16 5:01 HATAYAMA Daisuke
2012-11-16 5:01 ` [PATCH v2 01/10] Move page flags setup for old kernels after debuginfo initialization HATAYAMA Daisuke
` (10 more replies)
0 siblings, 11 replies; 20+ messages in thread
From: HATAYAMA Daisuke @ 2012-11-16 5:01 UTC (permalink / raw)
To: kumagai-atsushi; +Cc: kexec
This patch set implements free page filtering by looking up the mem_map
array instead of the free lists. This approach is compatible with cyclic
mode because a mem_map lookup can be divided into cycles. In contrast,
dividing the free pages contained in the free lists is difficult since
they are not sorted in physical-address order.
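As a rough illustration of why a mem_map lookup divides cleanly into cycles, here is a minimal sketch. It is not makedumpfile code: struct page_info, count_free_in_cycle(), count_free_all_cycles() and pfn_per_cycle are hypothetical names, and a plain array stands in for mem_map.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-page record; stands in for one struct page entry. */
struct page_info {
	int is_free;	/* 1 if the page is a free (buddy) page */
};

/*
 * Count the free pages in the PFN range [start_pfn, end_pfn).
 * A cycle touches only the entries of its own range, so each cycle
 * can be processed independently of the others.
 */
static unsigned long
count_free_in_cycle(const struct page_info *mem_map,
		    unsigned long start_pfn, unsigned long end_pfn)
{
	unsigned long pfn, nr_free = 0;

	assert(start_pfn <= end_pfn);
	for (pfn = start_pfn; pfn < end_pfn; pfn++)
		if (mem_map[pfn].is_free)
			nr_free++;
	return nr_free;
}

/*
 * Walk the whole array cycle by cycle. pfn_per_cycle is an assumed
 * number of pages that one cyclic buffer can describe.
 */
static unsigned long
count_free_all_cycles(const struct page_info *mem_map,
		      unsigned long max_pfn, unsigned long pfn_per_cycle)
{
	unsigned long start, end, total = 0;

	assert(pfn_per_cycle > 0);
	for (start = 0; start < max_pfn; start = end) {
		end = start + pfn_per_cycle;
		if (end > max_pfn)
			end = max_pfn;
		total += count_free_in_cycle(mem_map, start, end);
	}
	return total;
}
```

The result does not depend on the cycle size, which is the property the free lists lack: a list walk visits pages in an order unrelated to their PFNs, so it cannot be restricted to one PFN range without traversing the whole list.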
* Changes
v1 => v2:
- If not enough debuginfo is available, switch to the freelist
logic. In v1, free page filtering was disabled in this case.
- Add hard-coded values for a wider range of kernel versions.
- If some free pages may fail to be filtered, try to correct the
cyclic buffer size appropriately.
- Correct the comment explaining the cyclic buffer overrun, which
was broken in v1.
RFC => v1:
- Logic is automatically selected at runtime according to the
current mode. In cyclic mode, mem_map array logic is used. In
non-cyclic mode, free list logic is used.
- The RFC version is:
http://lists.infradead.org/pipermail/kexec/2012-June/006441.html
* TODO
Add the following values to VMCOREINFO in the upstream kernel. These
are used in the mem_map logic.
- OFFSET(page._mapcount)
- OFFSET(page.private)
- SIZE(pageflags)
- NUMBER(PG_buddy)
- NUMBER(PG_slab)
- NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)
* Test
I tested this patch set on the following kernel versions.
- 3.4
- 3.1
- 2.6.38
- 2.6.32
- 2.6.18
For the test, I manually supplied VMCOREINFO, extended with
the following values according to the kernel version.
- 3.1, 3.4
NUMBER(PG_slab)=7
SIZE(pageflags)=4
OFFSET(page._mapcount)=24
OFFSET(page.private)=48
NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)=-128
- 2.6.38
SIZE(pageflags)=4
OFFSET(page._mapcount)=12
OFFSET(page.private)=16
NUMBER(PG_slab)=7
NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)=-2
- 2.6.32
NUMBER(PG_slab)=7
NUMBER(PG_buddy)=19
OFFSET(page._mapcount)=12
OFFSET(page.private)=16
SIZE(pageflags)=4
- 2.6.18
NUMBER(PG_slab)=7
NUMBER(PG_buddy)=19
OFFSET(page._mapcount)=12
OFFSET(page.private)=16
---
HATAYAMA Daisuke (10):
Warn cyclic buffer overrun and correct it if possible
Add page_is_buddy for old kernels
Add page_is_buddy for PG_buddy
Add page_is_buddy for recent kernels
Exclude free pages by looking up mem_map array
Add hardcoded page flag values
Add debuginfo-related processing for VMCOREINFO/VMLINUX
Add new parameters to various tables
Add debuginfo interface for enum type size
Move page flags setup for old kernels after debuginfo initialization
dwarf_info.c | 18 ++++
dwarf_info.h | 1
makedumpfile.c | 226 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
makedumpfile.h | 35 ++++++++-
4 files changed, 269 insertions(+), 11 deletions(-)
--
Thanks.
HATAYAMA, Daisuke
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 01/10] Move page flags setup for old kernels after debuginfo initialization
2012-11-16 5:01 [PATCH v2 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
@ 2012-11-16 5:01 ` HATAYAMA Daisuke
2012-11-16 5:01 ` [PATCH v2 02/10] Add debuginfo interface for enum type size HATAYAMA Daisuke
` (9 subsequent siblings)
10 siblings, 0 replies; 20+ messages in thread
From: HATAYAMA Daisuke @ 2012-11-16 5:01 UTC (permalink / raw)
To: kumagai-atsushi; +Cc: kexec
Page flags setup needs to be done after debuginfo initialization, because
we use hard-coded values only when the debugging information doesn't provide
the corresponding values.
Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
---
makedumpfile.c | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/makedumpfile.c b/makedumpfile.c
index 1183330..6513059 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -2723,8 +2723,6 @@ initial(void)
debug_info = TRUE;
}
- if (!get_value_for_old_linux())
- return FALSE;
out:
if (!info->page_size) {
/*
@@ -2789,6 +2787,9 @@ out:
if (is_xen_memory() && !get_dom0_mapnr())
return FALSE;
+ if (!get_value_for_old_linux())
+ return FALSE;
+
return TRUE;
}
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 02/10] Add debuginfo interface for enum type size
2012-11-16 5:01 [PATCH v2 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
2012-11-16 5:01 ` [PATCH v2 01/10] Move page flags setup for old kernels after debuginfo initialization HATAYAMA Daisuke
@ 2012-11-16 5:01 ` HATAYAMA Daisuke
2012-11-16 5:01 ` [PATCH v2 03/10] Add new parameters to various tables HATAYAMA Daisuke
` (8 subsequent siblings)
10 siblings, 0 replies; 20+ messages in thread
From: HATAYAMA Daisuke @ 2012-11-16 5:01 UTC (permalink / raw)
To: kumagai-atsushi; +Cc: kexec
This is needed to determine whether a specified enumeration type is present
in the given debug information.
In this patch set, I'll use this to check whether enum pageflags is
present.
The newly introduced interface is a simple extension of the existing
one for enumeration values.
Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
---
dwarf_info.c | 18 +++++++++++++++++-
dwarf_info.h | 1 +
makedumpfile.h | 13 +++++++++++--
3 files changed, 29 insertions(+), 3 deletions(-)
diff --git a/dwarf_info.c b/dwarf_info.c
index 94b920a..09a8e1e 100644
--- a/dwarf_info.c
+++ b/dwarf_info.c
@@ -607,7 +607,7 @@ search_structure(Dwarf_Die *die, int *found)
static void
search_number(Dwarf_Die *die, int *found)
{
- int tag;
+ int tag, bytesize;
Dwarf_Word const_value;
Dwarf_Attribute attr;
Dwarf_Die child, *walker;
@@ -618,6 +618,22 @@ search_number(Dwarf_Die *die, int *found)
if (tag != DW_TAG_enumeration_type)
continue;
+ if (dwarf_info.cmd == DWARF_INFO_GET_ENUMERATION_TYPE_SIZE) {
+ name = dwarf_diename(die);
+
+ if (!name || strcmp(name, dwarf_info.struct_name))
+ continue;
+
+ if ((bytesize = dwarf_bytesize(die)) <= 0)
+ continue;
+
+ *found = TRUE;
+
+ dwarf_info.struct_size = bytesize;
+
+ return;
+ }
+
if (dwarf_child(die, &child) != 0)
continue;
diff --git a/dwarf_info.h b/dwarf_info.h
index 8d0084d..185cbb6 100644
--- a/dwarf_info.h
+++ b/dwarf_info.h
@@ -46,6 +46,7 @@ enum {
DWARF_INFO_CHECK_SYMBOL_ARRAY_TYPE,
DWARF_INFO_GET_SYMBOL_TYPE,
DWARF_INFO_GET_MEMBER_TYPE,
+ DWARF_INFO_GET_ENUMERATION_TYPE_SIZE,
};
char *get_dwarf_module_name(void);
diff --git a/makedumpfile.h b/makedumpfile.h
index 97aca2a..20f4d99 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -256,14 +256,23 @@ do { \
#define ARRAY_LENGTH(X) (array_table.X)
#define SIZE_INIT(X, Y) \
do { \
- if ((SIZE(X) = get_structure_size(Y, 0)) == FAILED_DWARFINFO) \
+ if ((SIZE(X) = get_structure_size(Y, DWARF_INFO_GET_STRUCT_SIZE)) \
+ == FAILED_DWARFINFO) \
return FALSE; \
} while (0)
#define TYPEDEF_SIZE_INIT(X, Y) \
do { \
- if ((SIZE(X) = get_structure_size(Y, 1)) == FAILED_DWARFINFO) \
+ if ((SIZE(X) = get_structure_size(Y, DWARF_INFO_GET_TYPEDEF_SIZE)) \
+ == FAILED_DWARFINFO) \
return FALSE; \
} while (0)
+#define ENUM_TYPE_SIZE_INIT(X, Y) \
+do { \
+ if ((SIZE(X) = get_structure_size(Y, \
+ DWARF_INFO_GET_ENUMERATION_TYPE_SIZE)) \
+ == FAILED_DWARFINFO) \
+ return FALSE; \
+} while (0)
#define OFFSET_INIT(X, Y, Z) \
do { \
if ((OFFSET(X) = get_member_offset(Y, Z, DWARF_INFO_GET_MEMBER_OFFSET)) \
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 03/10] Add new parameters to various tables
2012-11-16 5:01 [PATCH v2 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
2012-11-16 5:01 ` [PATCH v2 01/10] Move page flags setup for old kernels after debuginfo initialization HATAYAMA Daisuke
2012-11-16 5:01 ` [PATCH v2 02/10] Add debuginfo interface for enum type size HATAYAMA Daisuke
@ 2012-11-16 5:01 ` HATAYAMA Daisuke
2012-11-16 5:01 ` [PATCH v2 04/10] Add debuginfo-related processing for VMCOREINFO/VMLINUX HATAYAMA Daisuke
` (7 subsequent siblings)
10 siblings, 0 replies; 20+ messages in thread
From: HATAYAMA Daisuke @ 2012-11-16 5:01 UTC (permalink / raw)
To: kumagai-atsushi; +Cc: kexec
Add the new parameters required for mem_map-based free page filtering to
various tables.
Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
---
makedumpfile.h | 8 ++++++++
1 files changed, 8 insertions(+), 0 deletions(-)
diff --git a/makedumpfile.h b/makedumpfile.h
index 20f4d99..1304df0 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -1163,6 +1163,8 @@ struct size_table {
long cpumask_t;
long kexec_segment;
long elf64_hdr;
+
+ long pageflags;
};
struct offset_table {
@@ -1171,6 +1173,8 @@ struct offset_table {
long _count;
long mapping;
long lru;
+ long _mapcount;
+ long private;
} page;
struct mem_section {
long section_mem_map;
@@ -1332,6 +1336,10 @@ struct number_table {
long PG_lru;
long PG_private;
long PG_swapcache;
+ long PG_buddy;
+ long PG_slab;
+
+ long PAGE_BUDDY_MAPCOUNT_VALUE;
};
struct srcfile_table {
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 04/10] Add debuginfo-related processing for VMCOREINFO/VMLINUX
2012-11-16 5:01 [PATCH v2 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
` (2 preceding siblings ...)
2012-11-16 5:01 ` [PATCH v2 03/10] Add new parameters to various tables HATAYAMA Daisuke
@ 2012-11-16 5:01 ` HATAYAMA Daisuke
2012-11-16 5:01 ` [PATCH v2 05/10] Add hardcoded page flag values HATAYAMA Daisuke
` (6 subsequent siblings)
10 siblings, 0 replies; 20+ messages in thread
From: HATAYAMA Daisuke @ 2012-11-16 5:01 UTC (permalink / raw)
To: kumagai-atsushi; +Cc: kexec
Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
---
makedumpfile.c | 21 ++++++++++++++++++++-
1 files changed, 20 insertions(+), 1 deletions(-)
diff --git a/makedumpfile.c b/makedumpfile.c
index 6513059..0d03716 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -914,6 +914,8 @@ get_structure_info(void)
OFFSET_INIT(page.flags, "page", "flags");
OFFSET_INIT(page._count, "page", "_count");
OFFSET_INIT(page.mapping, "page", "mapping");
+ OFFSET_INIT(page._mapcount, "page", "_mapcount");
+ OFFSET_INIT(page.private, "page", "private");
/*
* Some vmlinux(s) don't have debugging information about
@@ -1005,6 +1007,10 @@ get_structure_info(void)
ENUM_NUMBER_INIT(PG_lru, "PG_lru");
ENUM_NUMBER_INIT(PG_private, "PG_private");
ENUM_NUMBER_INIT(PG_swapcache, "PG_swapcache");
+ ENUM_NUMBER_INIT(PG_buddy, "PG_buddy");
+ ENUM_NUMBER_INIT(PG_slab, "PG_slab");
+
+ ENUM_TYPE_SIZE_INIT(pageflags, "pageflags");
TYPEDEF_SIZE_INIT(nodemask_t, "nodemask_t");
@@ -1355,6 +1361,7 @@ write_vmcoreinfo_data(void)
WRITE_STRUCTURE_SIZE("list_head", list_head);
WRITE_STRUCTURE_SIZE("node_memblk_s", node_memblk_s);
WRITE_STRUCTURE_SIZE("nodemask_t", nodemask_t);
+ WRITE_STRUCTURE_SIZE("pageflags", pageflags);
/*
* write the member offset of 1st kernel
@@ -1363,6 +1370,8 @@ write_vmcoreinfo_data(void)
WRITE_MEMBER_OFFSET("page._count", page._count);
WRITE_MEMBER_OFFSET("page.mapping", page.mapping);
WRITE_MEMBER_OFFSET("page.lru", page.lru);
+ WRITE_MEMBER_OFFSET("page._mapcount", page._mapcount);
+ WRITE_MEMBER_OFFSET("page.private", page.private);
WRITE_MEMBER_OFFSET("mem_section.section_mem_map",
mem_section.section_mem_map);
WRITE_MEMBER_OFFSET("pglist_data.node_zones", pglist_data.node_zones);
@@ -1407,6 +1416,10 @@ write_vmcoreinfo_data(void)
WRITE_NUMBER("PG_lru", PG_lru);
WRITE_NUMBER("PG_private", PG_private);
WRITE_NUMBER("PG_swapcache", PG_swapcache);
+ WRITE_NUMBER("PG_buddy", PG_buddy);
+ WRITE_NUMBER("PG_slab", PG_slab);
+
+ WRITE_NUMBER("PAGE_BUDDY_MAPCOUNT_VALUE", PAGE_BUDDY_MAPCOUNT_VALUE);
/*
* write the source file of 1st kernel
@@ -1654,11 +1667,14 @@ read_vmcoreinfo(void)
READ_STRUCTURE_SIZE("list_head", list_head);
READ_STRUCTURE_SIZE("node_memblk_s", node_memblk_s);
READ_STRUCTURE_SIZE("nodemask_t", nodemask_t);
+ READ_STRUCTURE_SIZE("pageflags", pageflags);
READ_MEMBER_OFFSET("page.flags", page.flags);
READ_MEMBER_OFFSET("page._count", page._count);
READ_MEMBER_OFFSET("page.mapping", page.mapping);
READ_MEMBER_OFFSET("page.lru", page.lru);
+ READ_MEMBER_OFFSET("page._mapcount", page._mapcount);
+ READ_MEMBER_OFFSET("page.private", page.private);
READ_MEMBER_OFFSET("mem_section.section_mem_map",
mem_section.section_mem_map);
READ_MEMBER_OFFSET("pglist_data.node_zones", pglist_data.node_zones);
@@ -1695,9 +1711,12 @@ read_vmcoreinfo(void)
READ_NUMBER("PG_lru", PG_lru);
READ_NUMBER("PG_private", PG_private);
READ_NUMBER("PG_swapcache", PG_swapcache);
-
+ READ_NUMBER("PG_slab", PG_slab);
+ READ_NUMBER("PG_buddy", PG_buddy);
READ_SRCFILE("pud_t", pud_t);
+ READ_NUMBER("PAGE_BUDDY_MAPCOUNT_VALUE", PAGE_BUDDY_MAPCOUNT_VALUE);
+
return TRUE;
}
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 05/10] Add hardcoded page flag values
2012-11-16 5:01 [PATCH v2 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
` (3 preceding siblings ...)
2012-11-16 5:01 ` [PATCH v2 04/10] Add debuginfo-related processing for VMCOREINFO/VMLINUX HATAYAMA Daisuke
@ 2012-11-16 5:01 ` HATAYAMA Daisuke
2012-11-16 5:02 ` [PATCH v2 06/10] Exclude free pages by looking up mem_map array HATAYAMA Daisuke
` (5 subsequent siblings)
10 siblings, 0 replies; 20+ messages in thread
From: HATAYAMA Daisuke @ 2012-11-16 5:01 UTC (permalink / raw)
To: kumagai-atsushi; +Cc: kexec
Although these values should basically be exported through VMCOREINFO, we
cannot modify old kernels. Luckily, these values have changed so
infrequently that it's relatively easy to determine appropriate values
for a given kernel version without any symbol or type information.
On the other hand, the mem_map logic also needs the offsets of some
members of the page structure. These depend heavily on the kernel
version, so we avoid hard-coding them.
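The version-based fallback described above can be sketched as follows. This is only an illustration: KVER() is a hypothetical stand-in for a kernel version encoding macro, the function names are made up, and the constants are the ones this patch hard-codes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical version encoding, used only within this sketch. */
#define KVER(a, b, c)	(((long)(a) << 16) + ((long)(b) << 8) + (long)(c))

/*
 * Store the PG_buddy bit number for the given kernel version.
 * Returns 1 on success, 0 if the version has no PG_buddy fallback.
 */
static int
fallback_pg_buddy(long version, long *pg_buddy)
{
	assert(pg_buddy != NULL);
	if (version >= KVER(2, 6, 17) && version <= KVER(2, 6, 26)) {
		*pg_buddy = 19;
		return 1;
	}
	if (version >= KVER(2, 6, 27) && version <= KVER(2, 6, 37)) {
		*pg_buddy = 18;
		return 1;
	}
	return 0;
}

/*
 * Store PAGE_BUDDY_MAPCOUNT_VALUE for the given kernel version.
 * Returns 1 on success, 0 if the version predates this marker.
 */
static int
fallback_buddy_mapcount(long version, long *value)
{
	assert(value != NULL);
	if (version == KVER(2, 6, 38)) {
		*value = -2;
		return 1;
	}
	if (version >= KVER(2, 6, 39)) {
		*value = -128;
		return 1;
	}
	return 0;
}
```

A separate success flag is used here instead of a sentinel value because the legitimate results include negative numbers; the patch itself compares against its own not-found constants.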
Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
---
makedumpfile.c | 32 ++++++++++++++++++++++++++++++++
makedumpfile.h | 9 +++++++++
2 files changed, 41 insertions(+), 0 deletions(-)
diff --git a/makedumpfile.c b/makedumpfile.c
index 0d03716..1a0151c 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -1210,6 +1210,38 @@ get_value_for_old_linux(void)
NUMBER(PG_private) = PG_private_ORIGINAL;
if (NUMBER(PG_swapcache) == NOT_FOUND_NUMBER)
NUMBER(PG_swapcache) = PG_swapcache_ORIGINAL;
+ if (NUMBER(PG_slab) == NOT_FOUND_NUMBER)
+ NUMBER(PG_slab) = PG_slab_ORIGINAL;
+ /*
+ * The values from here are for free page filtering based on
+ * mem_map array. These are minimum effort to cover old
+ * kernels.
+ *
+ * The logic also needs offset values for some members of page
+ * structure. But it much depends on kernel versions. We avoid
+ * to hard code the values.
+ */
+ if (NUMBER(PG_buddy) == NOT_FOUND_NUMBER) {
+ if (info->kernel_version >= KERNEL_VERSION(2, 6, 17)
+ && info->kernel_version <= KERNEL_VERSION(2, 6, 26))
+ NUMBER(PG_buddy) = PG_buddy_v2_6_17_to_v2_6_26;
+ if (info->kernel_version >= KERNEL_VERSION(2, 6, 27)
+ && info->kernel_version <= KERNEL_VERSION(2, 6, 37))
+ NUMBER(PG_buddy) = PG_buddy_v2_6_27_to_v2_6_37;
+ }
+ if (NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) == NOT_FOUND_NUMBER) {
+ if (info->kernel_version == KERNEL_VERSION(2, 6, 38))
+ NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) =
+ PAGE_BUDDY_MAPCOUNT_VALUE_v2_6_38;
+ if (info->kernel_version >= KERNEL_VERSION(2, 6, 39))
+ NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) =
+ PAGE_BUDDY_MAPCOUNT_VALUE_v2_6_39_to_latest_version;
+ }
+ if (SIZE(pageflags) == NOT_FOUND_STRUCTURE) {
+ if (info->kernel_version >= KERNEL_VERSION(2, 6, 27))
+ SIZE(pageflags) =
+ PAGE_FLAGS_SIZE_v2_6_27_to_latest_version;
+ }
return TRUE;
}
diff --git a/makedumpfile.h b/makedumpfile.h
index 1304df0..d69bcca 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -71,9 +71,18 @@ int get_mem_type(void);
* The following values are for linux-2.6.25 or former.
*/
#define PG_lru_ORIGINAL (5)
+#define PG_slab_ORIGINAL (7)
#define PG_private_ORIGINAL (11) /* Has something at ->private */
#define PG_swapcache_ORIGINAL (15) /* Swap page: swp_entry_t in private */
+#define PG_buddy_v2_6_17_to_v2_6_26 (19)
+#define PG_buddy_v2_6_27_to_v2_6_37 (18)
+
+#define PAGE_BUDDY_MAPCOUNT_VALUE_v2_6_38 (-2)
+#define PAGE_BUDDY_MAPCOUNT_VALUE_v2_6_39_to_latest_version (-128)
+
+#define PAGE_FLAGS_SIZE_v2_6_27_to_latest_version (4)
+
#define PAGE_MAPPING_ANON (1)
#define LSEEKED_BITMAP (1)
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 06/10] Exclude free pages by looking up mem_map array
2012-11-16 5:01 [PATCH v2 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
` (4 preceding siblings ...)
2012-11-16 5:01 ` [PATCH v2 05/10] Add hardcoded page flag values HATAYAMA Daisuke
@ 2012-11-16 5:02 ` HATAYAMA Daisuke
2012-11-16 5:02 ` [PATCH v2 07/10] Add page_is_buddy for recent kernels HATAYAMA Daisuke
` (4 subsequent siblings)
10 siblings, 0 replies; 20+ messages in thread
From: HATAYAMA Daisuke @ 2012-11-16 5:02 UTC (permalink / raw)
To: kumagai-atsushi; +Cc: kexec
Add free page filtering logic to __exclude_unnecessary_pages.
Unlike other filtering levels, a single buddy page indicates multiple
free pages. So I put this logic in the first position, in
front of the page cache check, to avoid filtering pages that have already
been filtered as free pages.
Basically, this mem_map array logic is used in cyclic mode only. The
exception is when the debug information necessary for the mem_map logic
is not sufficiently available. In that case, we switch to the freelist logic.
In non-cyclic mode, the existing freelist logic is used.
The newly introduced page_is_buddy handler abstracts the buddy page
condition, which varies depending on the kernel version. On the kernel
versions supported by makedumpfile, there are three kinds of buddy
conditions. Later patches will introduce them in order.
If we fail to choose a correct page_is_buddy handler due to the absence of
debug information, we switch to the freelist logic.
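To make the "one buddy page indicates multiple free pages" point concrete, here is a minimal sketch of excluding a whole free block from a bitmap. The helper names and the byte-array bitmap layout are assumptions for illustration, not the makedumpfile bitmap interface; order plays the role of the value read from page.private.

```c
#include <assert.h>
#include <limits.h>

/* Clear the bit of one PFN in a byte-array bitmap (hypothetical helper). */
static void
clear_pfn_bit(unsigned char *bitmap, unsigned long pfn)
{
	bitmap[pfn / CHAR_BIT] &= (unsigned char)~(1u << (pfn % CHAR_BIT));
}

/*
 * Exclude the free block whose first page is pfn. A buddy page of the
 * given order stands for 2^order consecutive free pages, so all of
 * their bits are cleared. Returns the number of pages excluded.
 */
static unsigned long
exclude_buddy_block(unsigned char *bitmap, unsigned long pfn,
		    unsigned int order)
{
	unsigned long i, nr_pages;

	assert(order < sizeof(unsigned long) * CHAR_BIT);
	nr_pages = 1UL << order;
	for (i = 0; i < nr_pages; i++)
		clear_pfn_bit(bitmap, pfn + i);
	return nr_pages;
}
```

Because the block's trailing pages are cleared together with its first page, running this check before the other filtering levels keeps those pages from being examined and counted a second time.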
Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
---
makedumpfile.c | 45 ++++++++++++++++++++++++++++++++++++++++-----
makedumpfile.h | 5 +++++
2 files changed, 45 insertions(+), 5 deletions(-)
diff --git a/makedumpfile.c b/makedumpfile.c
index 1a0151c..0e44a8b 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -58,6 +58,8 @@ do { \
*ptr_long_table = value; \
} while (0)
+static void setup_page_is_buddy(void);
+
void
initialize_tables(void)
{
@@ -2841,6 +2843,9 @@ out:
if (!get_value_for_old_linux())
return FALSE;
+ if (info->flag_cyclic && (info->dump_level & DL_EXCLUDE_FREE))
+ setup_page_is_buddy();
+
return TRUE;
}
@@ -3663,6 +3668,18 @@ exclude_free_page(void)
return TRUE;
}
+static void
+setup_page_is_buddy(void)
+{
+ if (OFFSET(page.private) == NOT_FOUND_STRUCTURE)
+ goto out;
+
+out:
+ if (!info->page_is_buddy)
+ DEBUG_MSG("Can't select page_is_buddy handler; "
+ "follow free lists instead of mem_map array.\n");
+}
+
/*
* If using a dumpfile in kdump-compressed format as a source file
* instead of /proc/vmcore, 1st-bitmap of a new dumpfile must be
@@ -3873,8 +3890,8 @@ __exclude_unnecessary_pages(unsigned long mem_map,
unsigned long long pfn_read_start, pfn_read_end, index_pg;
unsigned char page_cache[SIZE(page) * PGMM_CACHED];
unsigned char *pcache;
- unsigned int _count;
- unsigned long flags, mapping;
+ unsigned int _count, _mapcount = 0;
+ unsigned long flags, mapping, private = 0;
/*
* Refresh the buffer of struct page, when changing mem_map.
@@ -3928,11 +3945,28 @@ __exclude_unnecessary_pages(unsigned long mem_map,
flags = ULONG(pcache + OFFSET(page.flags));
_count = UINT(pcache + OFFSET(page._count));
mapping = ULONG(pcache + OFFSET(page.mapping));
+ if (OFFSET(page._mapcount) != NOT_FOUND_STRUCTURE)
+ _mapcount = UINT(pcache + OFFSET(page._mapcount));
+ if (OFFSET(page.private) != NOT_FOUND_STRUCTURE)
+ private = ULONG(pcache + OFFSET(page.private));
/*
+ * Exclude the free page managed by a buddy
+ */
+ if ((info->dump_level & DL_EXCLUDE_FREE)
+ && info->flag_cyclic
+ && info->page_is_buddy
+ && info->page_is_buddy(flags, _mapcount, private, _count)) {
+ int i;
+
+ for (i = 0; i < (1 << private); ++i)
+ clear_bit_on_2nd_bitmap_for_kernel(pfn + i);
+ pfn_free += i;
+ }
+ /*
* Exclude the cache page without the private page.
*/
- if ((info->dump_level & DL_EXCLUDE_CACHE)
+ else if ((info->dump_level & DL_EXCLUDE_CACHE)
&& (isLRU(flags) || isSwapCache(flags))
&& !isPrivate(flags) && !isAnon(mapping)) {
if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
@@ -4013,7 +4047,7 @@ exclude_unnecessary_pages_cyclic(void)
*/
copy_bitmap_cyclic();
- if (info->dump_level & DL_EXCLUDE_FREE)
+ if ((info->dump_level & DL_EXCLUDE_FREE) && !info->page_is_buddy)
if (!exclude_free_page())
return FALSE;
@@ -4022,7 +4056,8 @@ exclude_unnecessary_pages_cyclic(void)
*/
if (info->dump_level & DL_EXCLUDE_CACHE ||
info->dump_level & DL_EXCLUDE_CACHE_PRI ||
- info->dump_level & DL_EXCLUDE_USER_DATA) {
+ info->dump_level & DL_EXCLUDE_USER_DATA ||
+ ((info->dump_level & DL_EXCLUDE_FREE) && info->page_is_buddy)) {
gettimeofday(&tv_start, NULL);
diff --git a/makedumpfile.h b/makedumpfile.h
index d69bcca..c236ece 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -1037,6 +1037,11 @@ struct DumpInfo {
*/
int flag_sadump_diskset;
enum sadump_format_type flag_sadump; /* sadump format type */
+ /*
+ * for filtering free pages managed by buddy system:
+ */
+ int (*page_is_buddy)(unsigned long flags, unsigned int _mapcount,
+ unsigned long private, unsigned int _count);
};
extern struct DumpInfo *info;
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 07/10] Add page_is_buddy for recent kernels
2012-11-16 5:01 [PATCH v2 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
` (5 preceding siblings ...)
2012-11-16 5:02 ` [PATCH v2 06/10] Exclude free pages by looking up mem_map array HATAYAMA Daisuke
@ 2012-11-16 5:02 ` HATAYAMA Daisuke
2012-11-16 5:02 ` [PATCH v2 08/10] Add page_is_buddy for PG_buddy HATAYAMA Daisuke
` (3 subsequent siblings)
10 siblings, 0 replies; 20+ messages in thread
From: HATAYAMA Daisuke @ 2012-11-16 5:02 UTC (permalink / raw)
To: kumagai-atsushi; +Cc: kexec
On v2.6.38 and later kernels, a buddy page is marked by
_mapcount == PAGE_BUDDY_MAPCOUNT_VALUE, whose value has changed once as follows:
kernel version    | PAGE_BUDDY_MAPCOUNT_VALUE
------------------+--------------------------
v2.6.38           | -2
v2.6.39 and later | -128
Note also that _mapcount shares its memory with other fields
for SLAB/SLUB when PG_slab is set. Before looking up the _mapcount value,
we need to check whether PG_slab is set.
Since this page_is_buddy needs _mapcount, we use the freelist logic if
_mapcount is not available.
Recall that PG_slab has been 7 since v2.6.15, so there is no need to check
whether it is available.
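One detail worth spelling out is that _mapcount is read from the dump as a raw unsigned value while the marker is negative. The sketch below shows one way to compare them safely; is_buddy_by_mapcount() and PG_SLAB_BIT are hypothetical names, and a 32-bit unsigned int is assumed.

```c
#include <assert.h>

/* PG_slab bit number, as stated in the description above. */
#define PG_SLAB_BIT	7

/*
 * Return 1 if the page looks like a buddy page under the _mapcount
 * convention, 0 otherwise.
 *
 * mapcount_raw is the field as read from memory, without sign.
 * Converting the negative marker to unsigned int is well defined in C
 * (it wraps modulo UINT_MAX + 1), so the comparison is done in
 * unsigned arithmetic.
 */
static int
is_buddy_by_mapcount(unsigned long flags, unsigned int mapcount_raw,
		     int buddy_mapcount_value)
{
	assert(buddy_mapcount_value < 0);

	/* With PG_slab set, the field holds slab data, not _mapcount. */
	if (flags & (1UL << PG_SLAB_BIT))
		return 0;

	return mapcount_raw == (unsigned int)buddy_mapcount_value;
}
```

For example, a raw value of 0xffffff80 matches the marker -128, and 0xfffffffe matches -2, provided PG_slab is clear.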
Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
---
makedumpfile.c | 21 +++++++++++++++++++++
1 files changed, 21 insertions(+), 0 deletions(-)
diff --git a/makedumpfile.c b/makedumpfile.c
index 0e44a8b..7b13dca 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -3668,12 +3668,33 @@ exclude_free_page(void)
return TRUE;
}
+/*
+ * For v2.6.38 and later kernel versions.
+ */
+static int
+page_is_buddy_v3(unsigned long flags, unsigned int _mapcount,
+ unsigned long private, unsigned int _count)
+{
+ if (flags & (1UL << NUMBER(PG_slab)))
+ return FALSE;
+
+ if (_mapcount == (int)NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE))
+ return TRUE;
+
+ return FALSE;
+}
+
static void
setup_page_is_buddy(void)
{
if (OFFSET(page.private) == NOT_FOUND_STRUCTURE)
goto out;
+ if (NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) != NOT_FOUND_NUMBER) {
+ if (OFFSET(page._mapcount) != NOT_FOUND_STRUCTURE)
+ info->page_is_buddy = page_is_buddy_v3;
+ }
+
out:
if (!info->page_is_buddy)
DEBUG_MSG("Can't select page_is_buddy handler; "
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 08/10] Add page_is_buddy for PG_buddy
2012-11-16 5:01 [PATCH v2 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
` (6 preceding siblings ...)
2012-11-16 5:02 ` [PATCH v2 07/10] Add page_is_buddy for recent kernels HATAYAMA Daisuke
@ 2012-11-16 5:02 ` HATAYAMA Daisuke
2012-11-27 6:00 ` Atsushi Kumagai
2012-11-16 5:02 ` [PATCH v2 09/10] Add page_is_buddy for old kernels HATAYAMA Daisuke
` (2 subsequent siblings)
10 siblings, 1 reply; 20+ messages in thread
From: HATAYAMA Daisuke @ 2012-11-16 5:02 UTC (permalink / raw)
To: kumagai-atsushi; +Cc: kexec
On kernels from v2.6.18 to v2.6.37, buddy page is marked by the
PG_buddy flag.
kernel version     | PG_buddy
-------------------+---------------------------------
v2.6.17 to v2.6.26 | 19
v2.6.27 to v2.6.37 | 19 if CONFIG_PAGEFLAGS_EXTEND=y
                   | 18 otherwise
We don't need to care about CONFIG_PAGEFLAGS_EXTEND because the
architectures specifying this as y are um and xtensa only. They are
not included in the supported architectures of makedumpfile.
Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
---
makedumpfile.c | 24 ++++++++++++++++++++----
1 files changed, 20 insertions(+), 4 deletions(-)
diff --git a/makedumpfile.c b/makedumpfile.c
index 7b13dca..4e5d4d3 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -3669,6 +3669,19 @@ exclude_free_page(void)
}
/*
+ * For the kernel versions from v2.6.17 to v2.6.37.
+ */
+static int
+page_is_buddy_v2(unsigned long flags, unsigned int _mapcount,
+ unsigned long private, unsigned int _count)
+{
+ if (flags & (1UL << NUMBER(PG_buddy)))
+ return TRUE;
+
+ return FALSE;
+}
+
+/*
* For v2.6.38 and later kernel versions.
*/
static int
@@ -3690,10 +3703,13 @@ setup_page_is_buddy(void)
if (OFFSET(page.private) == NOT_FOUND_STRUCTURE)
goto out;
- if (NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) != NOT_FOUND_NUMBER) {
- if (OFFSET(page._mapcount) != NOT_FOUND_STRUCTURE)
- info->page_is_buddy = page_is_buddy_v3;
- }
+ if (NUMBER(PG_buddy) == NOT_FOUND_NUMBER) {
+ if (NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) != NOT_FOUND_NUMBER) {
+ if (OFFSET(page._mapcount) != NOT_FOUND_STRUCTURE)
+ info->page_is_buddy = page_is_buddy_v3;
+ }
+ } else
+ info->page_is_buddy = page_is_buddy_v2;
out:
if (!info->page_is_buddy)
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 09/10] Add page_is_buddy for old kernels
2012-11-16 5:01 [PATCH v2 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
` (7 preceding siblings ...)
2012-11-16 5:02 ` [PATCH v2 08/10] Add page_is_buddy for PG_buddy HATAYAMA Daisuke
@ 2012-11-16 5:02 ` HATAYAMA Daisuke
2012-11-16 5:02 ` [PATCH v2 10/10] Warn cyclic buffer overrun and correct it if possible HATAYAMA Daisuke
2012-11-16 7:05 ` [PATCH v2 00/10] Support free page filtering looking up mem_map array Atsushi Kumagai
10 siblings, 0 replies; 20+ messages in thread
From: HATAYAMA Daisuke @ 2012-11-16 5:02 UTC (permalink / raw)
To: kumagai-atsushi; +Cc: kexec
On kernels from v2.6.15 to v2.6.17, a buddy page is marked by the
condition that the PG_private flag is set and _count == 0.
Since this page_is_buddy needs _count, we use the freelist logic if _count
is not available.
Unfortunately, I have yet to test this logic on these kernel versions
simply because I have been unable to boot them on my box.
Note that on these kernels, the free list can be corrupted due to a bug
in which the above two conditions are not checked atomically. PG_buddy
was introduced to fix this bug.
Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
---
makedumpfile.c | 17 +++++++++++++++++
1 files changed, 17 insertions(+), 0 deletions(-)
diff --git a/makedumpfile.c b/makedumpfile.c
index 4e5d4d3..65b9fd7 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -3669,6 +3669,20 @@ exclude_free_page(void)
}
/*
+ * For the kernel versions from v2.6.15 to v2.6.17.
+ */
+static int
+page_is_buddy_v1(unsigned long flags, unsigned int _mapcount,
+ unsigned long private, unsigned int _count)
+{
+ if ((flags & (1UL << NUMBER(PG_private)))
+ && _count == 0)
+ return TRUE;
+
+ return FALSE;
+}
+
+/*
* For the kernel versions from v2.6.17 to v2.6.37.
*/
static int
@@ -3707,6 +3721,9 @@ setup_page_is_buddy(void)
if (NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) != NOT_FOUND_NUMBER) {
if (OFFSET(page._mapcount) != NOT_FOUND_STRUCTURE)
info->page_is_buddy = page_is_buddy_v3;
+ } else if (SIZE(pageflags) == NOT_FOUND_STRUCTURE) {
+ if (OFFSET(page._count) != NOT_FOUND_STRUCTURE)
+ info->page_is_buddy = page_is_buddy_v1;
}
} else
info->page_is_buddy = page_is_buddy_v2;
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 10/10] Warn cyclic buffer overrun and correct it if possible
2012-11-16 5:01 [PATCH v2 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
` (8 preceding siblings ...)
2012-11-16 5:02 ` [PATCH v2 09/10] Add page_is_buddy for old kernels HATAYAMA Daisuke
@ 2012-11-16 5:02 ` HATAYAMA Daisuke
2013-09-11 7:51 ` Atsushi Kumagai
2012-11-16 7:05 ` [PATCH v2 00/10] Support free page filtering looking up mem_map array Atsushi Kumagai
10 siblings, 1 reply; 20+ messages in thread
From: HATAYAMA Daisuke @ 2012-11-16 5:02 UTC (permalink / raw)
To: kumagai-atsushi; +Cc: kexec
Clearing bits in the cyclic buffer can overrun the buffer
for some combinations of MAX_ORDER and cyclic buffer size.
The cyclic buffer size is corrected if possible.
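A worked sketch of the check may help. It follows the formula given in the comment of this patch, B = DIVIDE_UP(1 << (MAX_ORDER - 1), bits per byte), rather than the exact code in the diff, and the function names are hypothetical. With the default MAX_ORDER of 11, the largest block is 1024 pages, so B is 128 bytes.

```c
#include <assert.h>
#include <limits.h>

/*
 * Bitmap bytes needed to cover one maximum-order block:
 * ceil(2^(max_order - 1) / CHAR_BIT).
 */
static unsigned long
max_block_bitmap_bytes(unsigned int max_order)
{
	unsigned long nr_pages;

	assert(max_order >= 1 && max_order <= 31);
	nr_pages = 1UL << (max_order - 1);
	return (nr_pages + CHAR_BIT - 1) / CHAR_BIT;
}

/*
 * Return a cyclic buffer size C' <= bufsize with C' mod B == 0, so a
 * maximum-order block never straddles the end of a cycle. Returns 0
 * when bufsize is smaller than B and cannot be corrected.
 */
static unsigned long
safe_cyclic_bufsize(unsigned long bufsize, unsigned int max_order)
{
	unsigned long block = max_block_bitmap_bytes(max_order);

	if (bufsize < block)
		return 0;
	return bufsize - (bufsize % block);
}
```

For instance, a 1000-byte buffer with MAX_ORDER 11 is rounded down to 896 bytes (7 x 128), a 1024-byte buffer is left unchanged, and a 100-byte buffer cannot be corrected, which corresponds to the warning case in the patch.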
Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
---
makedumpfile.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 70 insertions(+), 1 deletions(-)
diff --git a/makedumpfile.c b/makedumpfile.c
index 65b9fd7..18a1e0a 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -58,6 +58,7 @@ do { \
*ptr_long_table = value; \
} while (0)
+static void check_cyclic_buffer_overrun(void);
static void setup_page_is_buddy(void);
void
@@ -2832,6 +2833,9 @@ out:
!sadump_generate_elf_note_from_dumpfile())
return FALSE;
+ if (info->flag_cyclic && info->dump_level & DL_EXCLUDE_FREE)
+ check_cyclic_buffer_overrun();
+
} else {
if (!get_mem_map_without_mm())
return FALSE;
@@ -3669,6 +3673,61 @@ exclude_free_page(void)
}
/*
+ * Let C be a cyclic buffer size and B a bitmap size used for
+ * representing maximum block size managed by buddy allocator.
+ *
+ * For some combinations of C and B, clearing operation can overrun
+ * the cyclic buffer. Let's consider three cases.
+ *
+ * - If C == B, this is trivially safe.
+ *
+ * - If B > C, overrun can easily happen.
+ *
+ * - In case of C > B, if C mod B != 0, then there exist n > m > 0,
+ * B > b > 0 such that n x C = m x B + b. This means that clearing
+ * operation overruns cyclic buffer (B - b)-bytes in the
+ * combination of n-th cycle and m-th block.
+ *
+ * Note that C mod B != 0 iff (m x C) mod B != 0 for some m.
+ *
+ * If C == B, C mod B == 0 always holds. Again, if B > C, C mod B != 0
+ * always holds. Hence, it's always sufficient to check the condition
+ * C mod B != 0 in order to determine whether overrun can happen or
+ * not.
+ *
+ * The bitmap size used for maximum block size B is calculated from
+ * MAX_ORDER as:
+ *
+ * B := DIVIDE_UP((1 << (MAX_ORDER - 1)), BITS_PER_BYTE)
+ *
+ * Normally, MAX_ORDER is 11 at default. This is configurable through
+ * CONFIG_FORCE_MAX_ZONEORDER.
+ */
+static void
+check_cyclic_buffer_overrun(void)
+{
+ int max_order = ARRAY_LENGTH(zone.free_area);
+ int max_order_nr_pages = 1 << (max_order - 1);
+ unsigned long max_block_size = roundup(max_order_nr_pages, BITPERBYTE);
+
+ if (info->bufsize_cyclic %
+ roundup(max_order_nr_pages, BITPERBYTE)) {
+ unsigned long bufsize;
+
+ if (max_block_size > info->bufsize_cyclic) {
+ MSG("WARNING: some free pages are not filtered.\n");
+ return;
+ }
+
+ bufsize = info->bufsize_cyclic;
+ info->bufsize_cyclic = round(bufsize, max_block_size);
+
+ MSG("cyclic buffer size has been changed: %lu => %lu\n",
+ bufsize, info->bufsize_cyclic);
+ }
+}
+
+/*
* For the kernel versions from v2.6.15 to v2.6.17.
*/
static int
@@ -4013,8 +4072,18 @@ __exclude_unnecessary_pages(unsigned long mem_map,
&& info->page_is_buddy(flags, _mapcount, private, _count)) {
int i;
- for (i = 0; i < (1 << private); ++i)
+ for (i = 0; i < (1 << private); ++i) {
+ /*
+ * According to combination of
+ * MAX_ORDER and size of cyclic
+ * buffer, this clearing bit operation
+ * can overrun the cyclic buffer.
+ *
+ * See check_cyclic_buffer_overrun()
+ * for the detail.
+ */
clear_bit_on_2nd_bitmap_for_kernel(pfn + i);
+ }
pfn_free += i;
}
/*
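As a rough illustration of the condition described in the comment above
check_cyclic_buffer_overrun(), the sketch below computes B from MAX_ORDER
using the DIVIDE_UP definition given in that comment and tests C mod B.
The function names are hypothetical and are not part of makedumpfile; the
correction step assumes the buffer size is rounded down to a multiple of B.

```c
#define BITPERBYTE 8UL

/* Bytes of bitmap needed to cover one maximum-order buddy block (B). */
static unsigned long
max_block_bitmap_size(int max_order)
{
	unsigned long nr_pages = 1UL << (max_order - 1);

	return (nr_pages + BITPERBYTE - 1) / BITPERBYTE;
}

/* Nonzero if C mod B != 0, i.e. a block can straddle a cycle boundary. */
static int
can_overrun(unsigned long bufsize_cyclic, unsigned long max_block_size)
{
	return bufsize_cyclic % max_block_size != 0;
}

/*
 * Largest multiple of max_block_size that is <= bufsize_cyclic.
 * Returns 0 when the buffer is smaller than one block, which is the
 * case that cannot be corrected and only produces a warning.
 */
static unsigned long
corrected_bufsize(unsigned long bufsize_cyclic, unsigned long max_block_size)
{
	return bufsize_cyclic - bufsize_cyclic % max_block_size;
}
```

With the default MAX_ORDER of 11, B is 128 bytes, so a 1000-byte buffer
would be shrunk to 896 bytes under these assumptions.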
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v2 00/10] Support free page filtering looking up mem_map array
2012-11-16 5:01 [PATCH v2 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
` (9 preceding siblings ...)
2012-11-16 5:02 ` [PATCH v2 10/10] Warn cyclic buffer overrun and correct it if possible HATAYAMA Daisuke
@ 2012-11-16 7:05 ` Atsushi Kumagai
10 siblings, 0 replies; 20+ messages in thread
From: Atsushi Kumagai @ 2012-11-16 7:05 UTC (permalink / raw)
To: d.hatayama; +Cc: kexec
Hello HATAYAMA-san,
On Fri, 16 Nov 2012 14:01:30 +0900
HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> wrote:
> This patch set implements filtering free pages looking up mem_map
> array instead of free lists. This is compatible for cyclic mode
> because looking up mem_map can be divided into cycles. On the other
> hand, dividing free pages contained in free lists is difficult since
> they are not sorted in physical-address order.
Thank you very much for your hard work.
I will release v1.5.1-rc with this patch set soon.
Thanks
Atsushi Kumagai
>
> * Changes
>
> v1 => v2:
> - If debuginfo is not available enough, switch logic to freelist
> one. On v1, free page filtering was disabled in this case.
> - Add hard-coded values in wider kernel versions.
> - If some free pages possibly fail to be filtered, try to correct
> cyclic buffer size appropriately.
> - Correct the comment explaining the cyclic buffer overrun, which
> was broken on v1.
>
> RFC => v1:
> - Logic is automatically selected at runtime according to the
> current mode. In cyclic mode, mem_map array logic is used. In
> non-cyclic mode, free list logic is used.
> - The RFC version is:
> http://lists.infradead.org/pipermail/kexec/2012-June/006441.html
>
> * TODO
>
> Add the following values in VMCOREINFO on the upstream kernel. These
> are used in the mem_map logic.
>
> - OFFSET(page._mapcount)
> - OFFSET(page.private)
> - SIZE(pageflags)
> - NUMBER(PG_buddy)
> - NUMBER(PG_slab)
> - NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)
>
> * Test
>
> I tested this patch set on the following kernel versions.
>
> - 3.4
> - 3.1
> - 2.6.38
> - 2.6.32
> - 2.6.18
>
> On the test, I manually specified VMCOREINFO while extending it with
> the following values according to the kernel versions.
>
> - 3.1, 3.4
> NUMBER(PG_slab)=7
> SIZE(pageflags)=4
> OFFSET(page._mapcount)=24
> OFFSET(page.private)=48
> NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)=-128
>
> - 2.6.38
> SIZE(pageflags)=4
> OFFSET(page._mapcount)=12
> OFFSET(page.private)=16
> NUMBER(PG_slab)=7
> NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE)=-2
>
> - 2.6.32
> NUMBER(PG_slab)=7
> NUMBER(PG_buddy)=19
> OFFSET(page._mapcount)=12
> OFFSET(page.private)=16
> SIZE(pageflags)=4
>
> - 2.6.18
> NUMBER(PG_slab)=7
> NUMBER(PG_buddy)=19
> OFFSET(page._mapcount)=12
> OFFSET(page.private)=16
>
> ---
>
> HATAYAMA Daisuke (10):
> Warn cyclic buffer overrun and correct it if possible
> Add page_is_buddy for old kernels
> Add page_is_buddy for PG_buddy
> Add page_is_buddy for recent kernels
> Exclude free pages by looking up mem_map array
> Add hardcoded page flag values
> Add debuginfo-related processing for VMCOREINFO/VMLINUX
> Add new parameters to various tables
> Add debuginfo interface for enum type size
> Move page flags setup for old kernels after debuginfo initialization
>
>
> dwarf_info.c | 18 ++++
> dwarf_info.h | 1
> makedumpfile.c | 226 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
> makedumpfile.h | 35 ++++++++-
> 4 files changed, 269 insertions(+), 11 deletions(-)
>
> --
>
> Thanks.
> HATAYAMA, Daisuke
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 08/10] Add page_is_buddy for PG_buddy
2012-11-16 5:02 ` [PATCH v2 08/10] Add page_is_buddy for PG_buddy HATAYAMA Daisuke
@ 2012-11-27 6:00 ` Atsushi Kumagai
2012-11-27 7:30 ` Atsushi Kumagai
2012-11-27 8:53 ` Hatayama, Daisuke
0 siblings, 2 replies; 20+ messages in thread
From: Atsushi Kumagai @ 2012-11-27 6:00 UTC (permalink / raw)
To: d.hatayama; +Cc: kexec
Hello HATAYAMA-san,
On Fri, 16 Nov 2012 14:02:13 +0900
HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> wrote:
> On kernels from v2.6.18 to v2.6.37, buddy page is marked by the
> PG_buddy flag.
>
> kernel version | PG_buddy
> ------------------ +---------------------------------
> v2.6.17 to v2.6.26 | 19
> v2.6.27 to v2.6.37 | 19 if CONFIG_PAGEFLAGS_EXTEND=y
> | 18 otherwise
>
> We don't need to care about CONFIG_PAGEFLAGS_EXTEND because the
> architectures specifying this as y are um and xtensa only. They are
> not included in the supported architectures of makedumpfile.
Sorry for pointing this out so late.
I ran regression tests for v1.5.1 on an x86_64 machine last weekend
and found that page_is_buddy_v2() can't exclude free pages
correctly for kernels 2.6.30 to 2.6.37.
I checked the configuration of these kernels and confirmed that
CONFIG_PAGEFLAGS_EXTENDED was set to y, so PG_buddy=18 is invalid for them.
This means PG_buddy also varies on the architectures that makedumpfile
supports.
So, how did you confirm that CONFIG_PAGEFLAGS_EXTENDED is only relevant to
um and xtensa?
And why did you test with PG_buddy=19 for 2.6.32 when you posted this patch set?
> - 2.6.32
> NUMBER(PG_slab)=7
> NUMBER(PG_buddy)=19
> OFFSET(page._mapcount)=12
> OFFSET(page.private)=16
> SIZE(pageflags)=4
Thanks
Atsushi Kumagai
> Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
> ---
>
> makedumpfile.c | 24 ++++++++++++++++++++----
> 1 files changed, 20 insertions(+), 4 deletions(-)
>
> diff --git a/makedumpfile.c b/makedumpfile.c
> index 7b13dca..4e5d4d3 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -3669,6 +3669,19 @@ exclude_free_page(void)
> }
>
> /*
> + * For the kernel versions from v2.6.17 to v2.6.37.
> + */
> +static int
> +page_is_buddy_v2(unsigned long flags, unsigned int _mapcount,
> + unsigned long private, unsigned int _count)
> +{
> + if (flags & (1UL << NUMBER(PG_buddy)))
> + return TRUE;
> +
> + return FALSE;
> +}
> +
> +/*
> * For v2.6.38 and later kernel versions.
> */
> static int
> @@ -3690,10 +3703,13 @@ setup_page_is_buddy(void)
> if (OFFSET(page.private) == NOT_FOUND_STRUCTURE)
> goto out;
>
> - if (NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) != NOT_FOUND_NUMBER) {
> - if (OFFSET(page._mapcount) != NOT_FOUND_STRUCTURE)
> - info->page_is_buddy = page_is_buddy_v3;
> - }
> + if (NUMBER(PG_buddy) == NOT_FOUND_NUMBER) {
> + if (NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) != NOT_FOUND_NUMBER) {
> + if (OFFSET(page._mapcount) != NOT_FOUND_STRUCTURE)
> + info->page_is_buddy = page_is_buddy_v3;
> + }
> + } else
> + info->page_is_buddy = page_is_buddy_v2;
>
> out:
> if (!info->page_is_buddy)
>
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 08/10] Add page_is_buddy for PG_buddy
2012-11-27 6:00 ` Atsushi Kumagai
@ 2012-11-27 7:30 ` Atsushi Kumagai
2012-11-27 8:53 ` Hatayama, Daisuke
1 sibling, 0 replies; 20+ messages in thread
From: Atsushi Kumagai @ 2012-11-27 7:30 UTC (permalink / raw)
To: d.hatayama; +Cc: kexec
On Tue, 27 Nov 2012 15:00:48 +0900
Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp> wrote:
> Hello HATAYAMA-san,
>
> On Fri, 16 Nov 2012 14:02:13 +0900
> HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> wrote:
>
> > On kernels from v2.6.18 to v2.6.37, buddy page is marked by the
> > PG_buddy flag.
> >
> > kernel version | PG_buddy
> > ------------------ +---------------------------------
> > v2.6.17 to v2.6.26 | 19
> > v2.6.27 to v2.6.37 | 19 if CONFIG_PAGEFLAGS_EXTEND=y
> > | 18 otherwise
Additionally, PG_private_2 was introduced in v2.6.30 with the commit
below:
commit 266cf658efcf6ac33541a46740f74f50c79d2b6b
Author: David Howells <dhowells@redhat.com>
Date: Fri Apr 3 16:42:36 2009 +0100
FS-Cache: Recruit a page flags for cache management
Recruit a page flag to aid in cache management. The following extra flag is
defined:
(1) PG_fscache (PG_private_2)
Therefore, PG_buddy varies as below:
kernel version | PG_buddy
------------------ +---------------------------------
v2.6.27 to v2.6.29 | 18 if CONFIG_PAGEFLAGS_EXTEND=y
| 17 otherwise
v2.6.30 to v2.6.37 | 19 if CONFIG_PAGEFLAGS_EXTEND=y
| 18 otherwise
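To make the combined tables easier to check, here is a small sketch that
encodes them as a lookup. The function name and its parameters are
hypothetical and do not exist in makedumpfile; the values come only from the
two tables in this thread (the v2.6.17 to v2.6.26 row from the original
patch description, the later rows from the table above).

```c
/*
 * Hypothetical helper, not part of makedumpfile: return the PG_buddy bit
 * number for a v2.6.<sublevel> kernel according to the tables in this
 * thread, or -1 if the version is not covered by them.
 * pageflags_extended is nonzero when CONFIG_PAGEFLAGS_EXTENDED=y.
 */
static int
pg_buddy_bit(int sublevel, int pageflags_extended)
{
	if (sublevel >= 17 && sublevel <= 26)
		return 19;
	if (sublevel >= 27 && sublevel <= 29)
		return pageflags_extended ? 18 : 17;
	if (sublevel >= 30 && sublevel <= 37)
		return pageflags_extended ? 19 : 18;
	return -1;
}
```

For example, pg_buddy_bit(32, 1) gives 19, which matches the x86_64
configuration I tested.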
Thanks
Atsushi Kumagai
> > We don't need to care about CONFIG_PAGEFLAGS_EXTEND because the
> > architectures specifying this as y are um and xtensa only. They are
> > not included in the supported architectures of makedumpfile.
>
> Sorry for too late point out.
>
> I did regression test for v1.5.1 on x86_64 machine in the last weekend
> and found the issue that page_is_buddy_v2() can't exclude free pages
> correctly for kernel 2.6.30 to 2.6.37.
> I checked the configuration of these kernels, I made sure that
> CONFIG_PAGEFLAGS_EXTENDED was set as y, so PG_buddy=18 is invalid for them.
>
> According to this fact, PG_buddy is variable for the supported architectures
> of makedumpfile, too.
> So, how did you confirm that CONFIG_PAGEFLAGS_EXTENDED is only related with
> um and xtensa ?
>
> And why did you test with PG_buddy=19 for 2.6.32 when you posted this patch set?
>
> > - 2.6.32
> > NUMBER(PG_slab)=7
> > NUMBER(PG_buddy)=19
> > OFFSET(page._mapcount)=12
> > OFFSET(page.private)=16
> > SIZE(pageflags)=4
>
>
> Thanks
> Atsushi Kumagai
>
> > Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
> > ---
> >
> > makedumpfile.c | 24 ++++++++++++++++++++----
> > 1 files changed, 20 insertions(+), 4 deletions(-)
> >
> > diff --git a/makedumpfile.c b/makedumpfile.c
> > index 7b13dca..4e5d4d3 100644
> > --- a/makedumpfile.c
> > +++ b/makedumpfile.c
> > @@ -3669,6 +3669,19 @@ exclude_free_page(void)
> > }
> >
> > /*
> > + * For the kernel versions from v2.6.17 to v2.6.37.
> > + */
> > +static int
> > +page_is_buddy_v2(unsigned long flags, unsigned int _mapcount,
> > + unsigned long private, unsigned int _count)
> > +{
> > + if (flags & (1UL << NUMBER(PG_buddy)))
> > + return TRUE;
> > +
> > + return FALSE;
> > +}
> > +
> > +/*
> > * For v2.6.38 and later kernel versions.
> > */
> > static int
> > @@ -3690,10 +3703,13 @@ setup_page_is_buddy(void)
> > if (OFFSET(page.private) == NOT_FOUND_STRUCTURE)
> > goto out;
> >
> > - if (NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) != NOT_FOUND_NUMBER) {
> > - if (OFFSET(page._mapcount) != NOT_FOUND_STRUCTURE)
> > - info->page_is_buddy = page_is_buddy_v3;
> > - }
> > + if (NUMBER(PG_buddy) == NOT_FOUND_NUMBER) {
> > + if (NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) != NOT_FOUND_NUMBER) {
> > + if (OFFSET(page._mapcount) != NOT_FOUND_STRUCTURE)
> > + info->page_is_buddy = page_is_buddy_v3;
> > + }
> > + } else
> > + info->page_is_buddy = page_is_buddy_v2;
> >
> > out:
> > if (!info->page_is_buddy)
> >
> >
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH v2 08/10] Add page_is_buddy for PG_buddy
2012-11-27 6:00 ` Atsushi Kumagai
2012-11-27 7:30 ` Atsushi Kumagai
@ 2012-11-27 8:53 ` Hatayama, Daisuke
2012-11-28 7:42 ` Atsushi Kumagai
1 sibling, 1 reply; 20+ messages in thread
From: Hatayama, Daisuke @ 2012-11-27 8:53 UTC (permalink / raw)
To: Atsushi Kumagai; +Cc: kexec@lists.infradead.org
> From: kexec-bounces@lists.infradead.org
> [mailto:kexec-bounces@lists.infradead.org] On Behalf Of Atsushi Kumagai
> Sent: Tuesday, November 27, 2012 3:01 PM
> Subject: Re: [PATCH v2 08/10] Add page_is_buddy for PG_buddy
>
> Hello HATAYAMA-san,
>
> On Fri, 16 Nov 2012 14:02:13 +0900
> HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> wrote:
>
> > On kernels from v2.6.18 to v2.6.37, buddy page is marked by the
> > PG_buddy flag.
> >
> > kernel version | PG_buddy
> > ------------------ +---------------------------------
> > v2.6.17 to v2.6.26 | 19
> > v2.6.27 to v2.6.37 | 19 if CONFIG_PAGEFLAGS_EXTEND=y
> > | 18 otherwise
> >
> > We don't need to care about CONFIG_PAGEFLAGS_EXTEND because the
> > architectures specifying this as y are um and xtensa only. They are
> > not included in the supported architectures of makedumpfile.
>
> Sorry for too late point out.
>
> I did regression test for v1.5.1 on x86_64 machine in the last weekend
> and found the issue that page_is_buddy_v2() can't exclude free pages
> correctly for kernel 2.6.30 to 2.6.37.
> I checked the configuration of these kernels, I made sure that
> CONFIG_PAGEFLAGS_EXTENDED was set as y, so PG_buddy=18 is invalid for them.
>
> According to this fact, PG_buddy is variable for the supported architectures
> of makedumpfile, too.
> So, how did you confirm that CONFIG_PAGEFLAGS_EXTENDED is only related with
> um and xtensa ?
>
I guess I ran git grep on a different kernel source tree.
> And why did you test with PG_buddy=19 for 2.6.32 when you posted this patch
> set?
>
> > - 2.6.32
> > NUMBER(PG_slab)=7
> > NUMBER(PG_buddy)=19
> > OFFSET(page._mapcount)=12
> > OFFSET(page.private)=16
> > SIZE(pageflags)=4
>
Sorry, I was not aware of this.
Anyway, the current hard-coded values don't work at all: before we even get
to PG_buddy, we cannot get the values of the page structure's members, such as
the offset of the private member, so the non-cyclic logic is always used instead.
How about dropping the hard-coded values?
Thanks.
HATAYAMA, Daisuke
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 08/10] Add page_is_buddy for PG_buddy
2012-11-27 8:53 ` Hatayama, Daisuke
@ 2012-11-28 7:42 ` Atsushi Kumagai
0 siblings, 0 replies; 20+ messages in thread
From: Atsushi Kumagai @ 2012-11-28 7:42 UTC (permalink / raw)
To: d.hatayama; +Cc: kexec
Hello HATAYAMA-san,
On Tue, 27 Nov 2012 08:53:00 +0000
"Hatayama, Daisuke" <d.hatayama@jp.fujitsu.com> wrote:
> > From: kexec-bounces@lists.infradead.org
> > [mailto:kexec-bounces@lists.infradead.org] On Behalf Of Atsushi Kumagai
> > Sent: Tuesday, November 27, 2012 3:01 PM
> > Subject: Re: [PATCH v2 08/10] Add page_is_buddy for PG_buddy
> >
> > Hello HATAYAMA-san,
> >
> > On Fri, 16 Nov 2012 14:02:13 +0900
> > HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com> wrote:
> >
> > > On kernels from v2.6.18 to v2.6.37, buddy page is marked by the
> > > PG_buddy flag.
> > >
> > > kernel version | PG_buddy
> > > ------------------ +---------------------------------
> > > v2.6.17 to v2.6.26 | 19
> > > v2.6.27 to v2.6.37 | 19 if CONFIG_PAGEFLAGS_EXTEND=y
> > > | 18 otherwise
> > >
> > > We don't need to care about CONFIG_PAGEFLAGS_EXTEND because the
> > > architectures specifying this as y are um and xtensa only. They are
> > > not included in the supported architectures of makedumpfile.
> >
> > Sorry for too late point out.
> >
> > I did regression test for v1.5.1 on x86_64 machine in the last weekend
> > and found the issue that page_is_buddy_v2() can't exclude free pages
> > correctly for kernel 2.6.30 to 2.6.37.
> > I checked the configuration of these kernels, I made sure that
> > CONFIG_PAGEFLAGS_EXTENDED was set as y, so PG_buddy=18 is invalid for them.
> >
> > According to this fact, PG_buddy is variable for the supported architectures
> > of makedumpfile, too.
> > So, how did you confirm that CONFIG_PAGEFLAGS_EXTENDED is only related with
> > um and xtensa ?
> >
>
> I guess I used git grep on different kernel source tree.
>
> > And why did you test with PG_buddy=19 for 2.6.32 when you posted this patch
> > set?
> >
> > > - 2.6.32
> > > NUMBER(PG_slab)=7
> > > NUMBER(PG_buddy)=19
> > > OFFSET(page._mapcount)=12
> > > OFFSET(page.private)=16
> > > SIZE(pageflags)=4
> >
>
> Sorry, I was not aware of this.
OK, I see.
> Anyway, the current hard-coded values doesn't work at all since before talking
> about PG_buddy, we cannot get values of page structure's members, such as offset
> value of private member, so non-cyclic is always used instead.
>
> How about dropping hard-coded values?
It seems better to drop the hard-coded values.
vmlinux (or vmcoreinfo) is necessary to get the values of the page structure's
members, and we can get NUMBER(PG_buddy) from vmlinux as well.
So I think hard-coding PG_buddy brings little benefit.
I had mistakenly believed that enumeration values can't be obtained from
vmlinux, so I thought the hard-coding was reasonable.
(It seems I just ran into the gcc bug below.)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41065
Anyway, to sum it up:
1. If vmlinux or vmcoreinfo is available and makedumpfile runs in cyclic mode,
page_is_buddy_v2() works correctly for v2.6.18 to v2.6.37.
2. Otherwise, free list logic is used.
I think it's a good policy for current kernels.
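The two-point policy above could be sketched roughly as follows. This is
only an illustration: the enum, the function name, and the have_page_info
flag are hypothetical and are not makedumpfile's actual code.

```c
enum free_page_logic {
	LOGIC_FREE_LIST,	/* walk the buddy allocator's free lists */
	LOGIC_MEM_MAP		/* scan the mem_map array, cycle by cycle */
};

/*
 * Hypothetical selector. have_page_info stands for "vmlinux or
 * vmcoreinfo supplied the page structure offsets and NUMBER(PG_buddy)".
 */
static enum free_page_logic
select_free_page_logic(int cyclic_mode, int have_page_info)
{
	if (cyclic_mode && have_page_info)
		return LOGIC_MEM_MAP;
	return LOGIC_FREE_LIST;
}
```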
Thanks
Atsushi Kumagai
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 10/10] Warn cyclic buffer overrun and correct it if possible
2012-11-16 5:02 ` [PATCH v2 10/10] Warn cyclic buffer overrun and correct it if possible HATAYAMA Daisuke
@ 2013-09-11 7:51 ` Atsushi Kumagai
2013-09-11 8:35 ` HATAYAMA Daisuke
0 siblings, 1 reply; 20+ messages in thread
From: Atsushi Kumagai @ 2013-09-11 7:51 UTC (permalink / raw)
To: d.hatayama; +Cc: kexec
Hello HATAYAMA-san,
(2012/11/16 14:02), HATAYAMA Daisuke wrote:
> Clearing bits on cyclic buffer can overrun the cyclic buffer
> according to some combination of MAX_ORDER and cyclic buffer size.
>
> The cyclic buffer size is corrected if possible.
>
> Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
I know this is very late, but I found that the update of pfn_cyclic is
missing. This can cause memory corruption.
From: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
Date: Wed, 11 Sep 2013 14:50:10 +0900
Subject: [PATCH] Update pfn_cyclic when the cyclic buffer size is
corrected.
When the clearing bit operation for excluding free pages can overrun
the cyclic buffer, the buffer size is changed with
check_cyclic_buffer_overrun().
Then pfn_cyclic should be recalculated.
Signed-off-by: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
---
makedumpfile.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/makedumpfile.c b/makedumpfile.c
index 09c0d4a..164b3f1 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -4091,6 +4091,7 @@ check_cyclic_buffer_overrun(void)
bufsize = info->bufsize_cyclic;
info->bufsize_cyclic = round(bufsize, max_block_size);
+ info->pfn_cyclic = info->bufsize_cyclic * BITPERBYTE;
MSG("cyclic buffer size has been changed: %lu => %lu\n",
bufsize, info->bufsize_cyclic);
--
1.8.0.2
I'll merge this patch into v1.5.5.
Thanks
Atsushi Kumagai
> ---
>
> makedumpfile.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
> 1 files changed, 70 insertions(+), 1 deletions(-)
>
> diff --git a/makedumpfile.c b/makedumpfile.c
> index 65b9fd7..18a1e0a 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -58,6 +58,7 @@ do { \
> *ptr_long_table = value; \
> } while (0)
>
> +static void check_cyclic_buffer_overrun(void);
> static void setup_page_is_buddy(void);
>
> void
> @@ -2832,6 +2833,9 @@ out:
> !sadump_generate_elf_note_from_dumpfile())
> return FALSE;
>
> + if (info->flag_cyclic && info->dump_level & DL_EXCLUDE_FREE)
> + check_cyclic_buffer_overrun();
> +
> } else {
> if (!get_mem_map_without_mm())
> return FALSE;
> @@ -3669,6 +3673,61 @@ exclude_free_page(void)
> }
>
> /*
> + * Let C be a cyclic buffer size and B a bitmap size used for
> + * representing maximum block size managed by buddy allocator.
> + *
> + * For some combinations of C and B, clearing operation can overrun
> + * the cyclic buffer. Let's consider three cases.
> + *
> + * - If C == B, this is trivially safe.
> + *
> + * - If B > C, overrun can easily happen.
> + *
> + * - In case of C > B, if C mod B != 0, then there exist n > m > 0,
> + * B > b > 0 such that n x C = m x B + b. This means that clearing
> + * operation overruns cyclic buffer (B - b)-bytes in the
> + * combination of n-th cycle and m-th block.
> + *
> + * Note that C mod B != 0 iff (m x C) mod B != 0 for some m.
> + *
> + * If C == B, C mod B == 0 always holds. Again, if B > C, C mod B != 0
> + * always holds. Hence, it's always sufficient to check the condition
> + * C mod B != 0 in order to determine whether overrun can happen or
> + * not.
> + *
> + * The bitmap size used for maximum block size B is calculated from
> + * MAX_ORDER as:
> + *
> + * B := DIVIDE_UP((1 << (MAX_ORDER - 1)), BITS_PER_BYTE)
> + *
> + * Normally, MAX_ORDER is 11 at default. This is configurable through
> + * CONFIG_FORCE_MAX_ZONEORDER.
> + */
> +static void
> +check_cyclic_buffer_overrun(void)
> +{
> + int max_order = ARRAY_LENGTH(zone.free_area);
> + int max_order_nr_pages = 1 << (max_order - 1);
> + unsigned long max_block_size = roundup(max_order_nr_pages, BITPERBYTE);
> +
> + if (info->bufsize_cyclic %
> + roundup(max_order_nr_pages, BITPERBYTE)) {
> + unsigned long bufsize;
> +
> + if (max_block_size > info->bufsize_cyclic) {
> + MSG("WARNING: some free pages are not filtered.\n");
> + return;
> + }
> +
> + bufsize = info->bufsize_cyclic;
> + info->bufsize_cyclic = round(bufsize, max_block_size);
> +
> + MSG("cyclic buffer size has been changed: %lu => %lu\n",
> + bufsize, info->bufsize_cyclic);
> + }
> +}
> +
> +/*
> * For the kernel versions from v2.6.15 to v2.6.17.
> */
> static int
> @@ -4013,8 +4072,18 @@ __exclude_unnecessary_pages(unsigned long mem_map,
> && info->page_is_buddy(flags, _mapcount, private, _count)) {
> int i;
>
> - for (i = 0; i < (1 << private); ++i)
> + for (i = 0; i < (1 << private); ++i) {
> + /*
> + * According to combination of
> + * MAX_ORDER and size of cyclic
> + * buffer, this clearing bit operation
> + * can overrun the cyclic buffer.
> + *
> + * See check_cyclic_buffer_overrun()
> + * for the detail.
> + */
> clear_bit_on_2nd_bitmap_for_kernel(pfn + i);
> + }
> pfn_free += i;
> }
> /*
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v2 10/10] Warn cyclic buffer overrun and correct it if possible
2013-09-11 7:51 ` Atsushi Kumagai
@ 2013-09-11 8:35 ` HATAYAMA Daisuke
2013-09-12 2:00 ` HATAYAMA Daisuke
0 siblings, 1 reply; 20+ messages in thread
From: HATAYAMA Daisuke @ 2013-09-11 8:35 UTC (permalink / raw)
To: Atsushi Kumagai; +Cc: kexec
(2013/09/11 16:51), Atsushi Kumagai wrote:
> Hello HATAYAMA-san,
>
> (2012/11/16 14:02), HATAYAMA Daisuke wrote:
>> Clearing bits on cyclic buffer can overrun the cyclic buffer
>> according to some combination of MAX_ORDER and cyclic buffer size.
>>
>> The cyclic buffer size is corrected if possible.
>>
>> Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
>
> I know it's so late, I found that updating pfn_cyclic is missing.
> It can cause memory corruption.
>
Hello Kumagai-san,
Reviewed-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
It might be even better to introduce some kind of helper function that
sets up these cyclic-mode-related parameters and then to use it in
initial() and check_cyclic_buffer_overrun().
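A minimal sketch of what such a helper might look like, assuming the only
two parameters involved are bufsize_cyclic and pfn_cyclic. The struct and
the function name are hypothetical stand-ins, not existing makedumpfile
definitions; the relation between the two fields is the one used in the
patch above.

```c
#define BITPERBYTE 8UL

/* Stand-in for the two cyclic-mode fields of makedumpfile's info structure. */
struct cyclic_params {
	unsigned long bufsize_cyclic;	/* bitmap buffer size in bytes */
	unsigned long pfn_cyclic;	/* number of pages covered by one cycle */
};

/*
 * Hypothetical helper: set both fields together so that they cannot get
 * out of sync when the buffer size is changed after initialization.
 */
static void
set_cyclic_bufsize(struct cyclic_params *p, unsigned long bufsize)
{
	p->bufsize_cyclic = bufsize;
	p->pfn_cyclic = bufsize * BITPERBYTE;
}
```

Callers would then never assign bufsize_cyclic directly.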
--
Thanks.
HATAYAMA, Daisuke
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 10/10] Warn cyclic buffer overrun and correct it if possible
2013-09-11 8:35 ` HATAYAMA Daisuke
@ 2013-09-12 2:00 ` HATAYAMA Daisuke
2013-09-12 6:17 ` Atsushi Kumagai
0 siblings, 1 reply; 20+ messages in thread
From: HATAYAMA Daisuke @ 2013-09-12 2:00 UTC (permalink / raw)
To: Atsushi Kumagai; +Cc: kexec
(2013/09/11 17:35), HATAYAMA Daisuke wrote:
> (2013/09/11 16:51), Atsushi Kumagai wrote:
>> Hello HATAYAMA-san,
>>
>> (2012/11/16 14:02), HATAYAMA Daisuke wrote:
>>> Clearing bits on cyclic buffer can overrun the cyclic buffer
>>> according to some combination of MAX_ORDER and cyclic buffer size.
>>>
>>> The cyclic buffer size is corrected if possible.
>>>
>>> Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
>>
>> I know it's so late, I found that updating pfn_cyclic is missing.
>> It can cause memory corruption.
>>
>
> Hello Kumagai-san,
>
> Reviewed-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
>
> It might be even better to introduce some kind of helper function that
> sets up these cyclic-mode-related parameters and then to use it in
> initial() and check_cyclic_buffer_overrun().
>
Hello Kumagai-san,
I found one more bug. Could you review it?
From c98375b9af6c19dff88823166eaf13674b4a47ec Mon Sep 17 00:00:00 2001
From: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Date: Thu, 12 Sep 2013 10:35:17 +0900
Subject: [PATCH] Use divideup() to calculate maximum required bitmap size
Currently, check_cyclic_buffer_overrun() wrongly uses roundup() to
calculate the maximum bitmap size required to represent the maximum
block size managed by the buddy allocator. As a result, max_block_size
is BITPERBYTE times larger than its correct size. The bug never affects
free-page filtering, since roundup(max_order_nr_pages, BITPERBYTE) is a
multiple of divideup(max_order_nr_pages, BITPERBYTE). However, the
following sanity check, (max_block_size > info->bufsize_cyclic), and
the recalculation of info->bufsize_cyclic become BITPERBYTE times too
conservative and therefore inefficient.
Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
---
makedumpfile.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/makedumpfile.c b/makedumpfile.c
index 164b3f1..e66c494 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -4078,10 +4078,10 @@ check_cyclic_buffer_overrun(void)
{
int max_order = ARRAY_LENGTH(zone.free_area);
int max_order_nr_pages = 1 << (max_order - 1);
- unsigned long max_block_size = roundup(max_order_nr_pages, BITPERBYTE);
+ unsigned long max_block_size = divideup(max_order_nr_pages,
+ BITPERBYTE);
- if (info->bufsize_cyclic %
- roundup(max_order_nr_pages, BITPERBYTE)) {
+ if (info->bufsize_cyclic % max_block_size) {
unsigned long bufsize;
if (max_block_size > info->bufsize_cyclic) {
--
1.8.3.1
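To show the difference concretely, here is a small sketch with standalone
versions of the two operations. The _ul names are hypothetical and chosen
so as not to clash with makedumpfile's own roundup() and divideup() macros,
whose exact definitions are not reproduced here.

```c
#define BITPERBYTE 8UL

/* Round x up to the next multiple of y. */
static unsigned long
roundup_ul(unsigned long x, unsigned long y)
{
	return ((x + y - 1) / y) * y;
}

/* Divide x by y, rounding the quotient up. */
static unsigned long
divideup_ul(unsigned long x, unsigned long y)
{
	return (x + y - 1) / y;
}
```

With MAX_ORDER 11, max_order_nr_pages is 1024. roundup_ul(1024, BITPERBYTE)
stays 1024, while divideup_ul(1024, BITPERBYTE) is 128. A 1152-byte cyclic
buffer is a multiple of 128 and needs no correction, but 1152 % 1024 != 0,
so the old check would have shrunk it to 1024 bytes.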
--
Thanks.
HATAYAMA, Daisuke
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v2 10/10] Warn cyclic buffer overrun and correct it if possible
2013-09-12 2:00 ` HATAYAMA Daisuke
@ 2013-09-12 6:17 ` Atsushi Kumagai
0 siblings, 0 replies; 20+ messages in thread
From: Atsushi Kumagai @ 2013-09-12 6:17 UTC (permalink / raw)
To: d.hatayama@jp.fujitsu.com; +Cc: kexec@lists.infradead.org
Hello HATAYAMA-san,
(2013/09/12 11:01), HATAYAMA Daisuke wrote:
> (2013/09/11 17:35), HATAYAMA Daisuke wrote:
>> (2013/09/11 16:51), Atsushi Kumagai wrote:
>>> Hello HATAYAMA-san,
>>>
>>> (2012/11/16 14:02), HATAYAMA Daisuke wrote:
>>>> Clearing bits on cyclic buffer can overrun the cyclic buffer
>>>> according to some combination of MAX_ORDER and cyclic buffer size.
>>>>
>>>> The cyclic buffer size is corrected if possible.
>>>>
>>>> Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
>>>
>>> I know it's so late, I found that updating pfn_cyclic is missing.
>>> It can cause memory corruption.
>>>
>>
>> Hello Kumagai-san,
>>
>> Reviewed-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
>>
>> It might be even better to introduce some kind of helper function that
>> sets up these cyclic-mode-related parameters and then to use it in
>> initial() and check_cyclic_buffer_overrun().
>>
>
> Hello Kumagai-san,
>
> I found one more bug. Could you review it?
Thanks, acked and pushed to devel branch.
Atsushi Kumagai
>
> From c98375b9af6c19dff88823166eaf13674b4a47ec Mon Sep 17 00:00:00 2001
> From: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
> Date: Thu, 12 Sep 2013 10:35:17 +0900
> Subject: [PATCH] Use divideup() to calculate maximum required bitmap size
>
> Currently, check_cyclic_buffer_overrun() wrongly calculates maximum
> bitmap size required to represent maximum block size managed by buddy
> allocator with roundup(). Then, max_block_size is BITPERBYTE-time
> larger than its correct size. As a result, although the bug never
> affect free-page filtering since roundup(max_order_nr_pages,
> BITPERBYTE) is a multiple of divideup(max_order_nr_pages, BITPERBYTE),
> the following sanity check, (max_block_size > info->bufsize_cyclic),
> and recalculation of info->bufsize_cyclic becomes BITPERBYTE-time
> conservative and inefficient.
>
> Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
> ---
> makedumpfile.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/makedumpfile.c b/makedumpfile.c
> index 164b3f1..e66c494 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -4078,10 +4078,10 @@ check_cyclic_buffer_overrun(void)
> {
> int max_order = ARRAY_LENGTH(zone.free_area);
> int max_order_nr_pages = 1 << (max_order - 1);
> - unsigned long max_block_size = roundup(max_order_nr_pages, BITPERBYTE);
> + unsigned long max_block_size = divideup(max_order_nr_pages,
> + BITPERBYTE);
>
> - if (info->bufsize_cyclic %
> - roundup(max_order_nr_pages, BITPERBYTE)) {
> + if (info->bufsize_cyclic % max_block_size) {
> unsigned long bufsize;
>
> if (max_block_size > info->bufsize_cyclic) {
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2013-09-12 6:26 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-11-16 5:01 [PATCH v2 00/10] Support free page filtering looking up mem_map array HATAYAMA Daisuke
2012-11-16 5:01 ` [PATCH v2 01/10] Move page flags setup for old kernels after debuginfo initialization HATAYAMA Daisuke
2012-11-16 5:01 ` [PATCH v2 02/10] Add debuginfo interface for enum type size HATAYAMA Daisuke
2012-11-16 5:01 ` [PATCH v2 03/10] Add new parameters to various tables HATAYAMA Daisuke
2012-11-16 5:01 ` [PATCH v2 04/10] Add debuginfo-related processing for VMCOREINFO/VMLINUX HATAYAMA Daisuke
2012-11-16 5:01 ` [PATCH v2 05/10] Add hardcoded page flag values HATAYAMA Daisuke
2012-11-16 5:02 ` [PATCH v2 06/10] Exclude free pages by looking up mem_map array HATAYAMA Daisuke
2012-11-16 5:02 ` [PATCH v2 07/10] Add page_is_buddy for recent kernels HATAYAMA Daisuke
2012-11-16 5:02 ` [PATCH v2 08/10] Add page_is_buddy for PG_buddy HATAYAMA Daisuke
2012-11-27 6:00 ` Atsushi Kumagai
2012-11-27 7:30 ` Atsushi Kumagai
2012-11-27 8:53 ` Hatayama, Daisuke
2012-11-28 7:42 ` Atsushi Kumagai
2012-11-16 5:02 ` [PATCH v2 09/10] Add page_is_buddy for old kernels HATAYAMA Daisuke
2012-11-16 5:02 ` [PATCH v2 10/10] Warn cyclic buffer overrun and correct it if possible HATAYAMA Daisuke
2013-09-11 7:51 ` Atsushi Kumagai
2013-09-11 8:35 ` HATAYAMA Daisuke
2013-09-12 2:00 ` HATAYAMA Daisuke
2013-09-12 6:17 ` Atsushi Kumagai
2012-11-16 7:05 ` [PATCH v2 00/10] Support free page filtering looking up mem_map array Atsushi Kumagai