Kexec Archive on lore.kernel.org
* [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time
@ 2014-10-13  9:34 Zhou Wenjian
  2014-10-13  9:34 ` [PATCH v2 1/5] Add support for splitblock Zhou Wenjian
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Zhou Wenjian @ 2014-10-13  9:34 UTC (permalink / raw)
  To: kexec

The issue is discussed at http://lists.infradead.org/pipermail/kexec/2014-March/011289.html

This patch implements the idea of a 2-pass algorithm with a small amount of memory used to manage the splitblock table.
Strictly speaking, the algorithm is still 3-pass, but the time of the second pass is much shorter.
The tables below show the performance with different sizes of cyclic-buffer and splitblock.
The test was executed on a machine with 128G of memory.

The values are the total time (including the first pass and second pass).
The values in brackets are the time of the second pass.
															      sec
	cyclic-buffer	1		2		4		8		16		32		64
splitblock-size
1M			4.74(0.00)	4.22(0.01)	3.94(0.01)	3.78(0.02)	3.71(0.03)	3.73(0.07)	3.74(0.10)	
2M			4.74(0.00)	4.19(0.00)	3.94(0.01)	3.80(0.03)	3.71(0.03)	3.72(0.07)	3.72(0.09)	
4M			4.73(0.00)	4.21(0.01)	3.95(0.01)	3.78(0.02)	3.70(0.02)	3.73(0.08)	3.73(0.10)	
8M			4.73(0.00)	4.19(0.00)	3.94(0.01)	3.83(0.02)	3.73(0.03)	3.72(0.07)	3.74(0.10)	
16M			4.74(0.01)	4.21(0.00)	3.94(0.01)	3.76(0.01)	3.73(0.03)	3.73(0.08)	3.74(0.10)	
32M			4.72(0.00)	4.20(0.02)	3.92(0.01)	3.77(0.02)	3.71(0.02)	3.70(0.06)	3.74(0.10)	
64M			4.74(0.01)	4.20(0.00)	3.95(0.01)	3.78(0.02)	3.70(0.02)	3.71(0.07)	3.72(0.09)	
128M			4.73(0.01)	4.20(0.00)	3.94(0.01)	3.78(0.02)	3.76(0.03)	3.72(0.08)	3.74(0.09)	
256M			4.75(0.02)	4.22(0.02)	3.96(0.03)	3.78(0.02)	3.70(0.03)	3.70(0.07)	3.74(0.11)	
512M			4.77(0.04)	4.21(0.03)	3.97(0.04)	3.79(0.03)	3.73(0.04)	3.75(0.09)	3.82(0.13)	
1G			4.82(0.09)	4.26(0.07)	4.00(0.08)	3.83(0.07)	3.76(0.08)	3.73(0.08)	3.76(0.12)	
2G			8.26(3.54)	7.34(3.14)	6.86(2.93)	6.56(2.80)	6.44(2.76)	6.45(2.79)	6.42(2.80)

the performance of the 3-pass algorithm
origin			8.25(3.54)	7.26(3.11)	6.80(2.91)	6.52(2.80)	6.39(2.76)	6.40(2.78)	6.45(2.85)

															       sec
	cyclic-buffer	128		256		512		1024		2048		4096		8192	
splitblock-size
1M			3.83(0.21)	3.94(0.33)	4.16(0.54)	4.61(0.99)	7.03(3.41)	8.73(5.11)	8.69(5.08)
2M			3.86(0.21)	3.92(0.32)	4.16(0.54)	4.64(0.98)	7.02(3.41)	8.71(5.09)	8.72(5.09)
4M			3.82(0.21)	3.95(0.32)	4.18(0.55)	4.62(0.99)	7.05(3.44)	8.70(5.09)	8.68(5.07)
8M			3.82(0.21)	3.95(0.33)	4.17(0.54)	4.58(0.97)	7.03(3.41)	8.79(5.16)	8.71(5.09)
16M			3.83(0.21)	3.93(0.31)	4.15(0.54)	4.60(0.98)	7.06(3.43)	8.76(5.13)	8.73(5.10)
32M			3.84(0.22)	3.93(0.32)	4.15(0.54)	4.61(0.98)	7.00(3.40)	8.69(5.08)	8.75(5.13)
64M			3.84(0.21)	3.94(0.33)	4.15(0.54)	4.60(0.98)	7.04(3.42)	8.74(5.10)	8.80(5.16)
128M			3.85(0.22)	3.97(0.33)	4.16(0.54)	4.60(0.98)	7.07(3.44)	8.68(5.07)	8.69(5.07)
256M			3.84(0.21)	3.94(0.33)	4.16(0.55)	4.64(1.00)	7.02(3.41)	8.74(5.11)	8.73(5.11)
512M			3.85(0.24)	3.97(0.34)	4.17(0.56)	4.61(0.99)	7.05(3.44)	8.73(5.11)	8.75(5.13)
1G			3.85(0.22)	3.96(0.35)	4.18(0.56)	4.65(1.00)	7.06(3.44)	8.76(5.12)	8.72(5.11)
2G			6.53(2.91)	6.86(3.25)	7.54(3.92)	8.95(5.31)	10.60(6.97)	14.08(10.47)	14.32(10.60)

the performance of the 3-pass algorithm
origin			6.64(3.05)	6.81(3.24)	7.51(3.93)	8.86(5.30)	10.51(6.94)	13.92(10.36)	14.11(10.55)


v1->v2:
	use splitblock instead of block
	add restriction (align to the page size) to splitblock size
	adjust the position of prepare_splitblock_table and check the return code
	use --splitblock-size to specify splitblock size and modify the print_info.c
	

Zhou Wenjian (5):
  Add support for splitblock
  Add tools for reading and writing from splitblock table
  Add module of generating table
  Add module of calculating start_pfn and end_pfn in each dumpfile
  Add support for --splitblock-size

 makedumpfile.8 |   16 ++++
 makedumpfile.c |  254 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 makedumpfile.h |   17 ++++
 print_info.c   |   16 ++++-
 4 files changed, 296 insertions(+), 7 deletions(-)

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


* [PATCH v2 1/5] Add support for splitblock
  2014-10-13  9:34 [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time Zhou Wenjian
@ 2014-10-13  9:34 ` Zhou Wenjian
  2014-10-28  6:30   ` HATAYAMA Daisuke
  2014-10-13  9:34 ` [PATCH v2 2/5] Add tools for reading and writing from splitblock table Zhou Wenjian
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Zhou Wenjian @ 2014-10-13  9:34 UTC (permalink / raw)
  To: kexec

When the --split option is specified, fair I/O workloads should be assigned
to each process. So the start and end pfn of each dumpfile should be
calculated while excluding unnecessary pages. However, it costs a lot of
time to run the exclusion over the whole memory. That is why struct
SplitBlock exists. Struct SplitBlock is designed to manage memory, mainly
by recording the number of dumpable pages. We can use the number of
dumpable pages to calculate the start and end pfn instead of running the
exclusion over the whole memory.

The char array *table in struct SplitBlock is used to record the number of
dumpable pages in each splitblock.
The table entry size is calculated as

			divideup(log2(splitblock_size / page_size), 8) bytes

so that the space taken by the table is small enough, while the code still
performs well when the number of pages in one splitblock is big enough.

Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
Signed-off-by: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
---
 makedumpfile.c |   23 +++++++++++++++++++++++
 makedumpfile.h |   14 ++++++++++++++
 2 files changed, 37 insertions(+), 0 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index b4d43d8..95d553c 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -34,6 +34,7 @@ struct srcfile_table	srcfile_table;
 
 struct vm_table		vt = { 0 };
 struct DumpInfo		*info = NULL;
+struct SplitBlock		*splitblock = NULL;
 
 char filename_stdout[] = FILENAME_STDOUT;
 
@@ -5685,6 +5686,28 @@ out:
 	return ret;
 }
 
+/*
+ * cyclic_split mode:
+ *	manage memory by splitblocks,
+ *	divide memory into splitblocks
+ *	use splitblock_table to record numbers of dumpable pages in each splitblock
+ */
+
+//calculate entry size based on the number of pages in one splitblock
+int
+calculate_entry_size(void){
+	int entry_num = 1, count = 1;
+	int entry_size;
+	while (entry_num < splitblock->page_per_splitblock){
+		entry_num = entry_num << 1;
+		count++;
+	}
+	entry_size = count/BITPERBYTE;
+	if (count %BITPERBYTE)
+		entry_size++;
+	return entry_size;
+}
+
 mdf_pfn_t
 get_num_dumpable(void)
 {
diff --git a/makedumpfile.h b/makedumpfile.h
index 96830b0..98b8404 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -1168,10 +1168,24 @@ struct DumpInfo {
 	 */
 	int (*page_is_buddy)(unsigned long flags, unsigned int _mapcount,
 			     unsigned long private, unsigned int _count);
+	/*
+	 *for cyclic_splitting mode, setup splitblock_size
+	 */
+	long long splitblock_size;
 };
 extern struct DumpInfo		*info;
 
 /*
+ *for cyclic_splitting mode,Manage memory by splitblock
+ */
+struct SplitBlock{
+        char *table;
+        long long num;
+        long long page_per_splitblock;
+        int entry_size;                 //counted by byte
+};
+
+/*
  * kernel VM-related data
  */
 struct vm_table {
-- 
1.7.1




* [PATCH v2 2/5] Add tools for reading and writing from splitblock table
  2014-10-13  9:34 [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time Zhou Wenjian
  2014-10-13  9:34 ` [PATCH v2 1/5] Add support for splitblock Zhou Wenjian
@ 2014-10-13  9:34 ` Zhou Wenjian
  2014-10-28  6:42   ` HATAYAMA Daisuke
  2014-10-13  9:34 ` [PATCH v2 3/5] Add module of generating table Zhou Wenjian
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Zhou Wenjian @ 2014-10-13  9:34 UTC (permalink / raw)
  To: kexec

The functions added in this patch are used for writing values to and
reading values from the char array in struct SplitBlock.

Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
Signed-off-by: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
---
 makedumpfile.c |   23 +++++++++++++++++++++++
 1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index 95d553c..a8d86f6 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -5708,6 +5708,29 @@ calculate_entry_size(void){
 	return entry_size;
 }
 
+void
+write_value_into_splitblock_table(char *splitblock_inner, unsigned long long content)
+{
+	char temp;
+	int i=0;
+	while (i++ < splitblock->entry_size) {
+		temp = content & 0xff;
+		content = content >> BITPERBYTE;
+		*splitblock_inner++ = temp;
+	}
+}
+unsigned long long
+read_value_from_splitblock_table(char *splitblock_inner)
+{
+	unsigned long long ret = 0;
+	int i;
+	for (i = splitblock->entry_size; i > 0; i--) {
+		ret = ret << BITPERBYTE;
+		ret += *(splitblock_inner + i - 1) & 0xff;
+	}
+	return ret;
+}
+
 mdf_pfn_t
 get_num_dumpable(void)
 {
-- 
1.7.1




* [PATCH v2 3/5] Add module of generating table
  2014-10-13  9:34 [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time Zhou Wenjian
  2014-10-13  9:34 ` [PATCH v2 1/5] Add support for splitblock Zhou Wenjian
  2014-10-13  9:34 ` [PATCH v2 2/5] Add tools for reading and writing from splitblock table Zhou Wenjian
@ 2014-10-13  9:34 ` Zhou Wenjian
  2014-10-28  7:01   ` HATAYAMA Daisuke
  2014-10-13  9:34 ` [PATCH v2 4/5] Add module of calculating start_pfn and end_pfn in each dumpfile Zhou Wenjian
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Zhou Wenjian @ 2014-10-13  9:34 UTC (permalink / raw)
  To: kexec

Set the splitblock size and generate the basic information of the splitblock table.

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
Signed-off-by: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
---
 makedumpfile.c |   95 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 makedumpfile.h |    2 +
 2 files changed, 96 insertions(+), 1 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index a8d86f6..a6f0be4 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -5208,7 +5208,13 @@ create_dump_bitmap(void)
 	if (info->flag_cyclic) {
 		if (!prepare_bitmap2_buffer_cyclic())
 			goto out;
-		info->num_dumpable = get_num_dumpable_cyclic();
+		if (info->flag_split){
+			if(!prepare_splitblock_table())
+				goto out;
+			info->num_dumpable = get_num_dumpable_cyclic_withsplit();
+		}
+		else
+			info->num_dumpable = get_num_dumpable_cyclic();
 
 		if (!info->flag_elf_dumpfile)
 			free_bitmap2_buffer_cyclic();
@@ -5731,6 +5737,57 @@ read_value_from_splitblock_table(char *splitblock_inner)
 	return ret;
 }
 
+/*
+ * The splitblock size is specified in kilobytes with the --splitblock-size <size> option.
+ * If not specified, set the default value.
+ */
+int
+check_splitblock_size(void)
+{
+	if (info->splitblock_size){
+		info->splitblock_size <<= 10;
+		if (info->splitblock_size == 0) {
+			ERRMSG("The splitblock size must not be 0. %s.\n", strerror(errno));
+			return FALSE;
+		}
+		if (info->splitblock_size % info->page_size != 0) {
+			ERRMSG("The splitblock size must be aligned to the page size. %s.\n",
+									strerror(errno));
+			return FALSE;
+		}
+	}
+	else{
+		// set default 1GB
+		info->splitblock_size = 1 << 30;
+	}
+	return TRUE;
+}
+
+int
+prepare_splitblock_table(void)
+{
+	if(!check_splitblock_size())
+		return FALSE;
+	if ((splitblock = calloc(1, sizeof(struct SplitBlock))) == NULL) {
+		ERRMSG("Can't allocate memory for the splitblock. %s.\n", strerror(errno));
+		return FALSE;
+	}
+	splitblock->page_per_splitblock = info->splitblock_size / info->page_size;
+	/*
+	 *divide memory into splitblocks.
+	 * if there is a remainder, call it memory not managed by splitblock,
+	 * and it will also be dealt with in calculate_end_pfn_by_splitblock()
+	 */
+	splitblock->num = info->max_mapnr/splitblock->page_per_splitblock;
+	splitblock->entry_size = calculate_entry_size();
+	if ((splitblock->table = (char *)calloc(sizeof(char), (splitblock->entry_size * splitblock->num)))
+										== NULL) {
+		ERRMSG("Can't allocate memory for the splitblock_table. %s.\n", strerror(errno));
+		return FALSE;
+	}
+	return TRUE;
+}
+
 mdf_pfn_t
 get_num_dumpable(void)
 {
@@ -5746,6 +5803,36 @@ get_num_dumpable(void)
 	return num_dumpable;
 }
 
+/*
+ * generate splitblock_table
+ * modified from function get_num_dumpable_cyclic
+ */
+mdf_pfn_t
+get_num_dumpable_cyclic_withsplit(void)
+{
+	mdf_pfn_t pfn, num_dumpable = 0;
+	mdf_pfn_t dumpable_pfn_num = 0, pfn_num = 0;
+	struct cycle cycle = {0};
+	int pos = 0;
+	for_each_cycle(0, info->max_mapnr, &cycle) {
+		if (!exclude_unnecessary_pages_cyclic(&cycle))
+			return FALSE;
+		for (pfn = cycle.start_pfn; pfn < cycle.end_pfn; pfn++) {
+			if (is_dumpable_cyclic(info->partial_bitmap2, pfn, &cycle)) {
+				num_dumpable++;
+				dumpable_pfn_num++;
+			}
+			if (++pfn_num >= splitblock->page_per_splitblock) {
+				write_value_into_splitblock_table(splitblock->table + pos, dumpable_pfn_num);
+				pos += splitblock->entry_size;
+				pfn_num = 0;
+				dumpable_pfn_num = 0;
+			}
+		}
+	}
+	return num_dumpable;
+}
+
 mdf_pfn_t
 get_num_dumpable_cyclic(void)
 {
@@ -9703,6 +9790,12 @@ out:
 		if (info->page_buf != NULL)
 			free(info->page_buf);
 		free(info);
+
+		if (splitblock) {
+			if (splitblock->table)
+				free(splitblock->table);
+		free(splitblock);
+		}
 	}
 	free_elf_info();
 
diff --git a/makedumpfile.h b/makedumpfile.h
index 98b8404..60e6f2f 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -1888,9 +1888,11 @@ struct elf_prstatus {
  * Function Prototype.
  */
 mdf_pfn_t get_num_dumpable_cyclic(void);
+mdf_pfn_t get_num_dumpable_cyclic_withsplit(void);
 int get_loads_dumpfile_cyclic(void);
 int initial_xen(void);
 unsigned long long get_free_memory_size(void);
 int calculate_cyclic_buffer_size(void);
+int prepare_splitblock_table(void);
 
 #endif /* MAKEDUMPFILE_H */
-- 
1.7.1




* [PATCH v2 4/5] Add module of calculating start_pfn and end_pfn in each dumpfile
  2014-10-13  9:34 [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time Zhou Wenjian
                   ` (2 preceding siblings ...)
  2014-10-13  9:34 ` [PATCH v2 3/5] Add module of generating table Zhou Wenjian
@ 2014-10-13  9:34 ` Zhou Wenjian
  2014-10-28  7:43   ` HATAYAMA Daisuke
  2014-10-13  9:34 ` [PATCH v2 5/5] Add support for --splitblock-size Zhou Wenjian
  2014-10-17  3:50 ` [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time Atsushi Kumagai
  5 siblings, 1 reply; 21+ messages in thread
From: Zhou Wenjian @ 2014-10-13  9:34 UTC (permalink / raw)
  To: kexec

When --split is specified in the cyclic mode, the start_pfn and end_pfn of each
dumpfile will be calculated so that each dumpfile has almost the same size.

Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
Signed-off-by: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
---
 makedumpfile.c |  109 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 files changed, 104 insertions(+), 5 deletions(-)

diff --git a/makedumpfile.c b/makedumpfile.c
index a6f0be4..32c0919 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -8190,6 +8190,103 @@ out:
 		return ret;
 }
 
+/*
+ * calculate end pfn in incomplete splitblock or memory not managed by splitblock
+ */
+mdf_pfn_t
+calculate_end_pfn_in_cycle(mdf_pfn_t start, mdf_pfn_t max,
+			    mdf_pfn_t end_pfn, long long pfn_needed_by_per_dumpfile)
+{
+	struct cycle cycle;
+	for_each_cycle(start,max,&cycle) {
+		if (!exclude_unnecessary_pages_cyclic(&cycle))
+			return FALSE;
+		while (end_pfn < cycle.end_pfn) {
+			end_pfn++;
+			if (is_dumpable_cyclic(info->partial_bitmap2, end_pfn, &cycle)){
+				if (--pfn_needed_by_per_dumpfile <= 0)
+					return ++end_pfn;
+			}
+		}
+	}
+	return ++end_pfn;
+}
+
+/*
+ * calculate end_pfn of one dumpfile.
+ * try to make every output file have the same size.
+ * splitblock_table is used to reduce calculate time.
+ */
+
+#define CURRENT_SPLITBLOCK_PFN_NUM (*current_splitblock * splitblock->page_per_splitblock)
+mdf_pfn_t
+calculate_end_pfn_by_splitblock(mdf_pfn_t start_pfn,
+			   int *current_splitblock,
+			   long long *current_splitblock_pfns){
+	mdf_pfn_t end_pfn;
+	long long pfn_needed_by_per_dumpfile,offset;
+	pfn_needed_by_per_dumpfile = info->num_dumpable / info->num_dumpfile;
+	offset = *current_splitblock * splitblock->entry_size;
+	end_pfn = start_pfn;
+	char *splitblock_inner = splitblock->table + offset;
+	//calculate the part containing complete splitblock
+	while (*current_splitblock < splitblock->num && pfn_needed_by_per_dumpfile > 0) {
+		if (*current_splitblock_pfns > 0) {
+			pfn_needed_by_per_dumpfile -= *current_splitblock_pfns ;
+			*current_splitblock_pfns = 0 ;
+		}
+		else
+		pfn_needed_by_per_dumpfile -= read_value_from_splitblock_table(splitblock_inner);
+		splitblock_inner += splitblock->entry_size;
+		++*current_splitblock;
+	}
+	//deal with complete splitblock
+	if (pfn_needed_by_per_dumpfile == 0)
+		end_pfn = CURRENT_SPLITBLOCK_PFN_NUM;
+	//deal with incomplete splitblock
+	if (pfn_needed_by_per_dumpfile < 0) {
+		--*current_splitblock;
+		splitblock_inner -= splitblock->entry_size;
+		end_pfn = CURRENT_SPLITBLOCK_PFN_NUM;
+		*current_splitblock_pfns = (-1) * pfn_needed_by_per_dumpfile;
+		pfn_needed_by_per_dumpfile += read_value_from_splitblock_table(splitblock_inner);
+		end_pfn = calculate_end_pfn_in_cycle(CURRENT_SPLITBLOCK_PFN_NUM,
+						     CURRENT_SPLITBLOCK_PFN_NUM+splitblock->page_per_splitblock,
+						     end_pfn,pfn_needed_by_per_dumpfile);
+	}
+	//deal with memory not managed by splitblock
+	if (pfn_needed_by_per_dumpfile > 0 && *current_splitblock >= splitblock->num) {
+		mdf_pfn_t cycle_start_pfn = MAX(CURRENT_SPLITBLOCK_PFN_NUM,end_pfn);
+		end_pfn=calculate_end_pfn_in_cycle(cycle_start_pfn,
+						   info->max_mapnr,
+						   end_pfn,
+						   pfn_needed_by_per_dumpfile);
+	}
+	return end_pfn;
+}
+/*
+ * calculate start_pfn and end_pfn in each output file.
+ */
+static int setup_splitting_cyclic(void)
+{
+	int i;
+	mdf_pfn_t start_pfn, end_pfn;
+	long long current_splitblock_pfns = 0;
+	int current_splitblock = 0;
+	start_pfn = end_pfn = 0;
+	for (i = 0; i < info->num_dumpfile - 1; i++) {
+		start_pfn = end_pfn;
+		end_pfn = calculate_end_pfn_by_splitblock(start_pfn,
+							  &current_splitblock,
+							  &current_splitblock_pfns);
+		SPLITTING_START_PFN(i) = start_pfn;
+		SPLITTING_END_PFN(i) = end_pfn;
+	}
+	SPLITTING_START_PFN(info->num_dumpfile - 1) = end_pfn;
+	SPLITTING_END_PFN(info->num_dumpfile - 1) = info->max_mapnr;
+	return TRUE;
+}
+
 int
 setup_splitting(void)
 {
@@ -8203,12 +8300,14 @@ setup_splitting(void)
 		return FALSE;
 
 	if (info->flag_cyclic) {
-		for (i = 0; i < info->num_dumpfile; i++) {
-			SPLITTING_START_PFN(i) = divideup(info->max_mapnr, info->num_dumpfile) * i;
-			SPLITTING_END_PFN(i)   = divideup(info->max_mapnr, info->num_dumpfile) * (i + 1);
+		int ret = FALSE;
+		if(!prepare_bitmap2_buffer_cyclic()){
+			free_bitmap_buffer();
+			return ret;
 		}
-		if (SPLITTING_END_PFN(i-1) > info->max_mapnr)
-			SPLITTING_END_PFN(i-1) = info->max_mapnr;
+		ret = setup_splitting_cyclic();
+		free_bitmap2_buffer_cyclic();
+		return ret;
         } else {
 		initialize_2nd_bitmap(&bitmap2);
 
-- 
1.7.1




* [PATCH v2 5/5] Add support for --splitblock-size
  2014-10-13  9:34 [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time Zhou Wenjian
                   ` (3 preceding siblings ...)
  2014-10-13  9:34 ` [PATCH v2 4/5] Add module of calculating start_pfn and end_pfn in each dumpfile Zhou Wenjian
@ 2014-10-13  9:34 ` Zhou Wenjian
  2014-10-28  7:15   ` HATAYAMA Daisuke
  2014-10-17  3:50 ` [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time Atsushi Kumagai
  5 siblings, 1 reply; 21+ messages in thread
From: Zhou Wenjian @ 2014-10-13  9:34 UTC (permalink / raw)
  To: kexec

Use --splitblock-size to specify the splitblock size (in KB).
When --split is specified in the cyclic mode, the splitblock table will be
generated in create_dump_bitmap().

Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
Signed-off-by: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
---
 makedumpfile.8 |   16 ++++++++++++++++
 makedumpfile.c |    4 ++++
 makedumpfile.h |    1 +
 print_info.c   |   16 +++++++++++++++-
 4 files changed, 36 insertions(+), 1 deletions(-)

diff --git a/makedumpfile.8 b/makedumpfile.8
index 9cb12c0..a5b7055 100644
--- a/makedumpfile.8
+++ b/makedumpfile.8
@@ -386,6 +386,22 @@ size, so ordinary users don't need to specify this option.
 # makedumpfile \-\-cyclic\-buffer 1024 \-d 31 \-x vmlinux /proc/vmcore dumpfile
 
 .TP
+\fB\-\-splitblock\-size\fR \fIsplitblock_size\fR
+Specify the splitblock size in kilobytes for analysis in the cyclic mode with \-\-split.
+In the cyclic split mode, the number of splitblocks is represented as:
+
+    num_of_splitblocks = system_memory / (\fIsplitblock_size\fR * 1KB )
+
The larger the number of splitblocks, the faster the expected working speed, but the more memory
will be taken. By default, \fIsplitblock_size\fR is set to 1GB, so ordinary users don't need to
specify this option.
+
+.br
+.B Example:
+.br
+# makedumpfile \-\-splitblock\-size 10240 \-d 31 \-x vmlinux \-\-split /proc/vmcore dumpfile1 dumpfile2
+
+.TP
 \fB\-\-non\-cyclic\fR
 Running in the non-cyclic mode, this mode uses the old filtering logic same as v1.4.4 or before.
 If you feel the cyclic mode is too slow, please try this mode.
diff --git a/makedumpfile.c b/makedumpfile.c
index 32c0919..112f2e4 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -9578,6 +9578,7 @@ static struct option longopts[] = {
 	{"eppic", required_argument, NULL, OPT_EPPIC},
 	{"non-mmap", no_argument, NULL, OPT_NON_MMAP},
 	{"mem-usage", no_argument, NULL, OPT_MEM_USAGE},
+	{"splitblock-size", required_argument, NULL, OPT_SPLITBLOCK_SIZE},
 	{0, 0, 0, 0}
 };
 
@@ -9718,6 +9719,9 @@ main(int argc, char *argv[])
 		case OPT_CYCLIC_BUFFER:
 			info->bufsize_cyclic = atoi(optarg);
 			break;
+		case OPT_SPLITBLOCK_SIZE:
+			info->splitblock_size = atoi(optarg);
+			break;
 		case '?':
 			MSG("Commandline parameter is invalid.\n");
 			MSG("Try `makedumpfile --help' for more information.\n");
diff --git a/makedumpfile.h b/makedumpfile.h
index 60e6f2f..7bc57d9 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -1883,6 +1883,7 @@ struct elf_prstatus {
 #define OPT_EPPIC               OPT_START+12
 #define OPT_NON_MMAP            OPT_START+13
 #define OPT_MEM_USAGE            OPT_START+14
+#define OPT_SPLITBLOCK_SIZE		OPT_START+15
 
 /*
  * Function Prototype.
diff --git a/print_info.c b/print_info.c
index f6342d3..2cdffd8 100644
--- a/print_info.c
+++ b/print_info.c
@@ -203,7 +203,18 @@ print_usage(void)
 	MSG("      By default, BUFFER_SIZE will be calculated automatically depending on\n");
 	MSG("      system memory size, so ordinary users don't need to specify this option.\n");
 	MSG("\n");
-	MSG("  [--non-cyclic]:\n");
+	MSG("  [--splitblock-size SPLITBLOCK_SIZE]:\n");
+	MSG("      Specify the splitblock size in kilobytes for analysis in the cyclic mode\n");
+	MSG("      with --split.\n");
+	MSG("      In the cyclic mode, the number of splitblocks is represented as:\n");
+	MSG("\n");
+	MSG("          num_of_splitblocks = system_memory / (splitblock_size * 1KB)\n");
+	MSG("\n");
+	MSG("      The larger the number of splitblocks, the faster the expected speed, but\n");
+	MSG("      the more memory will be taken. By default, splitblock_size is set to 1GB,\n");
+	MSG("      so ordinary users don't need to specify this option.\n");
+	MSG("\n");
+	MSG("  [--non-cyclic]:\n");
 	MSG("      Running in the non-cyclic mode, this mode uses the old filtering logic\n");
 	MSG("      same as v1.4.4 or before.\n");
 	MSG("      If you feel the cyclic mode is too slow, please try this mode.\n");
-- 
1.7.1




* RE: [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time
  2014-10-13  9:34 [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time Zhou Wenjian
                   ` (4 preceding siblings ...)
  2014-10-13  9:34 ` [PATCH v2 5/5] Add support for --splitblock-size Zhou Wenjian
@ 2014-10-17  3:50 ` Atsushi Kumagai
  2014-10-22  1:52   ` "Zhou, Wenjian/周文剑"
  2014-10-27  6:19   ` "Zhou, Wenjian/周文剑"
  5 siblings, 2 replies; 21+ messages in thread
From: Atsushi Kumagai @ 2014-10-17  3:50 UTC (permalink / raw)
  To: zhouwj-fnst@cn.fujitsu.com; +Cc: kexec@lists.infradead.org

Hello,

The code looks good to me, thanks Zhou.
Now, I have a question on performance.

>The issue is discussed at http://lists.infradead.org/pipermail/kexec/2014-March/011289.html
>
>This patch implements the idea of a 2-pass algorithm with a small amount of memory used to manage the splitblock table.
>Strictly speaking, the algorithm is still 3-pass, but the time of the second pass is much shorter.
>The tables below show the performance with different sizes of cyclic-buffer and splitblock.
>The test was executed on a machine with 128G of memory.
>
>The values are the total time (including the first pass and second pass).
>The values in brackets are the time of the second pass.

Do you have any idea why the time of second pass is much larger when
the splitblock-size is 2G ? I worry about the scalability.


Thanks
Atsushi Kumagai

>															      sec
>	cyclic-buffer	1		2		4		8		16		32		64
>splitblock-size
>1M			4.74(0.00)	4.22(0.01)	3.94(0.01)	3.78(0.02)	3.71(0.03)	3.73(0.07)	3.74(0.10)
>2M			4.74(0.00)	4.19(0.00)	3.94(0.01)	3.80(0.03)	3.71(0.03)	3.72(0.07)	3.72(0.09)
>4M			4.73(0.00)	4.21(0.01)	3.95(0.01)	3.78(0.02)	3.70(0.02)	3.73(0.08)	3.73(0.10)
>8M			4.73(0.00)	4.19(0.00)	3.94(0.01)	3.83(0.02)	3.73(0.03)	3.72(0.07)	3.74(0.10)
>16M			4.74(0.01)	4.21(0.00)	3.94(0.01)	3.76(0.01)	3.73(0.03)	3.73(0.08)	3.74(0.10)
>32M			4.72(0.00)	4.20(0.02)	3.92(0.01)	3.77(0.02)	3.71(0.02)	3.70(0.06)	3.74(0.10)
>64M			4.74(0.01)	4.20(0.00)	3.95(0.01)	3.78(0.02)	3.70(0.02)	3.71(0.07)	3.72(0.09)
>128M			4.73(0.01)	4.20(0.00)	3.94(0.01)	3.78(0.02)	3.76(0.03)	3.72(0.08)	3.74(0.09)
>256M			4.75(0.02)	4.22(0.02)	3.96(0.03)	3.78(0.02)	3.70(0.03)	3.70(0.07)	3.74(0.11)
>512M			4.77(0.04)	4.21(0.03)	3.97(0.04)	3.79(0.03)	3.73(0.04)	3.75(0.09)	3.82(0.13)
>1G			4.82(0.09)	4.26(0.07)	4.00(0.08)	3.83(0.07)	3.76(0.08)	3.73(0.08)	3.76(0.12)
>2G			8.26(3.54)	7.34(3.14)	6.86(2.93)	6.56(2.80)	6.44(2.76)	6.45(2.79)	6.42(2.80)
>
>the performance of the 3-pass algorithm
>origin			8.25(3.54)	7.26(3.11)	6.80(2.91)	6.52(2.80)	6.39(2.76)	6.40(2.78)	6.45(2.85)
>
>
>															       sec
>	cyclic-buffer	128		256		512		1024		2048		4096		8192
>splitblock-size
>1M			3.83(0.21)	3.94(0.33)	4.16(0.54)	4.61(0.99)	7.03(3.41)	8.73(5.11)	8.69(5.08)
>2M			3.86(0.21)	3.92(0.32)	4.16(0.54)	4.64(0.98)	7.02(3.41)	8.71(5.09)	8.72(5.09)
>4M			3.82(0.21)	3.95(0.32)	4.18(0.55)	4.62(0.99)	7.05(3.44)	8.70(5.09)	8.68(5.07)
>8M			3.82(0.21)	3.95(0.33)	4.17(0.54)	4.58(0.97)	7.03(3.41)	8.79(5.16)	8.71(5.09)
>16M			3.83(0.21)	3.93(0.31)	4.15(0.54)	4.60(0.98)	7.06(3.43)	8.76(5.13)	8.73(5.10)
>32M			3.84(0.22)	3.93(0.32)	4.15(0.54)	4.61(0.98)	7.00(3.40)	8.69(5.08)	8.75(5.13)
>64M			3.84(0.21)	3.94(0.33)	4.15(0.54)	4.60(0.98)	7.04(3.42)	8.74(5.10)	8.80(5.16)
>128M			3.85(0.22)	3.97(0.33)	4.16(0.54)	4.60(0.98)	7.07(3.44)	8.68(5.07)	8.69(5.07)
>256M			3.84(0.21)	3.94(0.33)	4.16(0.55)	4.64(1.00)	7.02(3.41)	8.74(5.11)	8.73(5.11)
>512M			3.85(0.24)	3.97(0.34)	4.17(0.56)	4.61(0.99)	7.05(3.44)	8.73(5.11)	8.75(5.13)
>1G			3.85(0.22)	3.96(0.35)	4.18(0.56)	4.65(1.00)	7.06(3.44)	8.76(5.12)	8.72(5.11)
>2G			6.53(2.91)	6.86(3.25)	7.54(3.92)	8.95(5.31)	10.60(6.97)	14.08(10.47)	14.32(10.60)
>
>the performance of the 3-pass algorithm
>origin			6.64(3.05)	6.81(3.24)	7.51(3.93)	8.86(5.30)	10.51(6.94)	13.92(10.36)	14.11(10.55)
>
>
>v1->v2:
>	use splitblock instead of block
>	add restriction (align to the page size) to splitblock size
>	adjust the position of prepare_splitblock_table and check the return code
>	use --splitblock-size to specify splitblock size and modify the print_info.c
>
>
>Zhou Wenjian (5):
>  Add support for splitblock
>  Add tools for reading and writing from splitblock table
>  Add module of generating table
>  Add module of calculating start_pfn and end_pfn in each dumpfile
>  Add support for --splitblock-size
>
> makedumpfile.8 |   16 ++++
> makedumpfile.c |  254 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
> makedumpfile.h |   17 ++++
> print_info.c   |   16 ++++-
> 4 files changed, 296 insertions(+), 7 deletions(-)
>



* Re: [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time
  2014-10-17  3:50 ` [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time Atsushi Kumagai
@ 2014-10-22  1:52   ` "Zhou, Wenjian/周文剑"
  2014-10-27  1:08     ` "Zhou, Wenjian/周文剑"
  2014-10-27  7:51     ` Atsushi Kumagai
  2014-10-27  6:19   ` "Zhou, Wenjian/周文剑"
  1 sibling, 2 replies; 21+ messages in thread
From: "Zhou, Wenjian/周文剑" @ 2014-10-22  1:52 UTC (permalink / raw)
  To: Atsushi Kumagai; +Cc: kexec@lists.infradead.org

On 10/17/2014 11:50 AM, Atsushi Kumagai wrote:
> Hello,
>
> The code looks good to me, thanks Zhou.
> Now, I have a question on performance.
>
>> The issue is discussed at http://lists.infradead.org/pipermail/kexec/2014-March/011289.html
>>
>> This patch implements the idea of a 2-pass algorithm with a small amount of memory used to manage the splitblock table.
>> Strictly speaking, the algorithm is still 3-pass, but the time of the second pass is much shorter.
>> The tables below show the performance with different sizes of cyclic-buffer and splitblock.
>> The test was executed on a machine with 128G of memory.
>>
>> The values are the total time (including the first pass and second pass).
>> The values in brackets are the time of the second pass.
>
> Do you have any idea why the time of the second pass is much larger when
> the splitblock-size is 2G? I'm worried about the scalability.
>
Hello,

	Since the previous machine is no longer available for some reasons, I ran the tests several times with the latest
code on other machines, but the issue never reproduced. Everything seems to be right. The tests were executed on two machines (a server and a PC).
Tests are based on:
		test1:
			machine:	server
			crashkernel:	512M
			vmcore:		/proc/vmcore (128G)
			
		
		test2:
			machine:	pc
			crashkernel:	256M
			vmcore:		vmcore dumped from the server
	
		test3:
			machine:	pc
			crashkernel:	128M
			vmcore:		vmcore dumped from the server



test1:
													      sec
		cyc-buf	2		4		8		16		32		64	
	splblk-size
	2M		1.53(0.00)	1.24(0.00)	1.08(0.01)	0.99(0.01)	0.95(0.01)	0.95(0.04)
	4M		1.52(0.00)	1.24(0.00)	1.08(0.01)	0.99(0.01)	0.95(0.01)	0.96(0.04)
	8M		1.53(0.00)	1.23(0.00)	1.08(0.01)	0.99(0.01)	0.95(0.01)	0.95(0.04)
	16M		1.53(0.00)	1.24(0.00)	1.08(0.01)	1.00(0.01)	0.94(0.01)	0.96(0.04)
	32M		1.52(0.00)	1.25(0.00)	1.08(0.01)	0.99(0.01)	0.94(0.01)	0.96(0.05)
	64M		1.52(0.00)	1.23(0.00)	1.07(0.01)	0.99(0.01)	0.97(0.01)	0.96(0.05)
	128M		1.54(0.01)	1.25(0.01)	1.08(0.01)	0.99(0.01)	0.96(0.02)	0.96(0.05)
	256M		1.53(0.01)	1.25(0.01)	1.08(0.01)	0.99(0.01)	0.95(0.02)	0.96(0.05)
	512M		1.53(0.01)	1.25(0.01)	1.07(0.01)	1.01(0.02)	0.95(0.02)	0.96(0.05)
	1G		1.54(0.02)	1.25(0.02)	1.08(0.02)	1.00(0.02)	0.96(0.03)	0.97(0.06)
	2G		1.59(0.07)	1.30(0.06)	1.11(0.05)	1.04(0.06)	1.00(0.07)	0.98(0.07)
	4G		1.60(0.08)	1.31(0.07)	1.14(0.07)	1.05(0.07)	1.02(0.08)	1.00(0.08)
	8G		1.60(0.08)	1.31(0.07)	1.14(0.07)	1.05(0.07)	1.02(0.08)	0.99(0.08)
	16G		1.60(0.08)	1.30(0.07)	1.14(0.07)	1.05(0.07)	1.02(0.08)	0.99(0.08)
	
		cyc-buf	128		256		512		1024		2048		4096	
	splblk-size
	2M		0.95(0.04)	0.94(0.04)	0.94(0.04)	0.95(0.05)	0.94(0.05)	1.68(0.78)
	4M		0.95(0.04)	0.94(0.04)	0.94(0.04)	0.95(0.05)	0.94(0.05)	1.67(0.78)
	8M		0.94(0.04)	0.94(0.04)	0.94(0.04)	0.95(0.05)	0.95(0.05)	1.68(0.78)
	16M		0.96(0.05)	0.94(0.04)	0.95(0.05)	0.94(0.04)	0.94(0.05)	1.68(0.78)
	32M		0.95(0.05)	0.94(0.05)	0.94(0.05)	0.94(0.05)	0.94(0.05)	1.68(0.78)
	64M		0.96(0.05)	0.95(0.05)	0.94(0.05)	0.94(0.05)	0.95(0.05)	1.67(0.78)
	128M		0.96(0.05)	0.95(0.05)	0.95(0.05)	0.95(0.05)	0.94(0.05)	1.67(0.78)
	256M		0.95(0.05)	0.95(0.05)	0.95(0.05)	0.94(0.05)	0.94(0.05)	1.67(0.78)
	512M		0.96(0.05)	0.95(0.05)	0.94(0.05)	0.94(0.05)	0.95(0.05)	1.68(0.79)
	1G		0.96(0.06)	0.96(0.06)	0.97(0.06)	0.96(0.06)	0.96(0.07)	1.69(0.80)
	2G		0.97(0.07)	0.97(0.07)	0.97(0.07)	0.96(0.07)	0.96(0.07)	1.70(0.80)
	4G		1.02(0.10)	1.00(0.10)	0.99(0.10)	1.00(0.10)	0.99(0.10)	1.74(0.84)
	8G		1.00(0.10)	1.06(0.16)	1.05(0.16)	1.05(0.16)	1.06(0.16)	1.78(0.89)
	16G		1.00(0.10)	1.06(0.16)	1.16(0.26)	1.16(0.26)	1.15(0.26)	1.90(1.00)




test2:
														sec
		cyc-buf	2		4		8		16		32		64	
	splblk-size
	2M		23.42(0.03)	23.38(0.05)	23.46(0.11)	23.56(0.17)	23.56(0.23)	23.84(0.48)
	4M		23.35(0.03)	23.35(0.05)	23.56(0.11)	23.51(0.17)	23.60(0.22)	23.81(0.48)
	8M		23.34(0.03)	23.38(0.05)	23.46(0.11)	23.55(0.16)	23.58(0.22)	23.84(0.48)
	16M		23.39(0.03)	23.36(0.06)	23.42(0.11)	23.50(0.16)	23.59(0.23)	23.86(0.48)
	32M		23.43(0.03)	23.38(0.06)	23.47(0.12)	23.54(0.19)	23.58(0.23)	23.89(0.48)
	64M		23.42(0.04)	23.43(0.07)	23.47(0.12)	23.53(0.18)	23.59(0.23)	23.87(0.48)
	128M		23.45(0.07)	23.41(0.09)	23.52(0.14)	23.56(0.19)	23.59(0.23)	23.81(0.48)
	256M		23.47(0.12)	23.48(0.14)	23.50(0.14)	23.55(0.20)	23.62(0.23)	23.84(0.48)
	512M		23.48(0.18)	23.56(0.19)	23.55(0.19)	23.60(0.24)	23.71(0.32)	23.88(0.49)
	1G		23.74(0.30)	23.53(0.23)	23.64(0.31)	23.54(0.24)	23.65(0.27)	23.98(0.52)
	2G		23.78(0.48)	23.82(0.48)	23.84(0.48)	23.83(0.49)	23.89(0.52)	23.84(0.52)
	4G		23.91(0.52)	23.81(0.50)	23.92(0.50)	23.87(0.51)	23.88(0.54)	23.92(0.54)
	8G		23.80(0.50)	23.90(0.52)	23.82(0.49)	23.84(0.51)	23.90(0.54)	23.90(0.54)
	16G		23.85(0.51)	23.86(0.50)	23.85(0.51)	23.87(0.51)	23.95(0.54)	23.88(0.54)
		
		cyc-buf	128		256		512		1024		2048		4096	
	splblk-size
	2M		23.90(0.47)	23.84(0.48)	23.84(0.48)	24.06(0.47)	23.93(0.47)	49.00(25.46)
	4M		23.87(0.48)	23.84(0.48)	23.91(0.48)	23.93(0.47)	23.98(0.47)	49.00(25.46)
	8M		23.80(0.48)	23.81(0.48)	23.95(0.48)	23.95(0.47)	24.03(0.52)	49.05(25.45)
	16M		23.86(0.47)	23.84(0.48)	23.90(0.48)	23.93(0.47)	24.00(0.47)	48.95(25.41)
	32M		23.86(0.48)	23.82(0.48)	23.98(0.54)	23.99(0.47)	23.97(0.47)	49.03(25.47)
	64M		23.83(0.48)	23.79(0.48)	23.89(0.48)	24.02(0.47)	23.93(0.47)	48.96(25.49)
	128M		23.92(0.55)	23.91(0.48)	23.85(0.48)	23.98(0.47)	23.90(0.47)	48.98(25.50)
	256M		23.86(0.48)	23.88(0.48)	24.00(0.48)	24.17(0.47)	23.94(0.47)	49.01(25.50)
	512M		23.85(0.49)	23.89(0.55)	23.93(0.49)	23.91(0.48)	24.01(0.48)	49.12(25.54)
	1G		23.85(0.52)	23.84(0.52)	23.98(0.52)	23.98(0.51)	24.02(0.51)	49.28(25.73)
	2G		23.92(0.52)	23.87(0.52)	23.93(0.52)	24.11(0.52)	24.04(0.52)	49.27(25.77)
	4G		24.24(0.91)	24.31(0.91)	24.29(0.91)	24.37(0.90)	24.37(0.90)	50.01(26.49)
	8G		24.27(0.91)	24.94(1.61)	25.07(1.62)	25.11(1.60)	25.02(1.60)	51.24(27.77)
	16G		24.29(0.91)	24.98(1.63)	29.32(5.85)	29.34(5.83)	29.33(5.83)	53.91(30.43)	




test3:
														sec
		cyc-buf	2		4		8		16		32		64	
	splblk-size
	2M		23.34(0.03)	23.38(0.05)	23.46(0.11)	23.47(0.16)	23.55(0.22)	23.97(0.65)
	4M		23.38(0.03)	23.41(0.05)	23.46(0.11)	23.47(0.16)	23.56(0.22)	23.99(0.65)
	8M		23.33(0.03)	23.41(0.06)	23.50(0.11)	23.52(0.16)	23.54(0.23)	24.05(0.66)
	16M		23.38(0.03)	23.51(0.14)	23.46(0.12)	23.54(0.16)	23.57(0.23)	23.98(0.65)
	32M		23.34(0.03)	23.37(0.06)	23.47(0.12)	23.53(0.18)	23.56(0.23)	24.06(0.66)
	64M		23.36(0.06)	23.43(0.08)	23.52(0.13)	23.53(0.18)	23.63(0.23)	23.99(0.66)
	128M		23.47(0.07)	23.40(0.09)	23.54(0.15)	23.50(0.19)	23.57(0.23)	24.15(0.74)
	256M		23.48(0.13)	23.52(0.14)	23.50(0.15)	23.54(0.19)	23.56(0.23)	23.98(0.66)
	512M		23.49(0.18)	23.49(0.19)	23.60(0.20)	23.66(0.32)	23.64(0.23)	24.15(0.75)
	1G		23.58(0.23)	23.54(0.23)	23.62(0.24)	23.57(0.24)	23.64(0.27)	24.17(0.81)
	2G		24.00(0.65)	23.99(0.66)	24.01(0.67)	24.01(0.70)	24.13(0.80)	24.43(0.82)
	4G		24.04(0.66)	23.97(0.67)	24.05(0.71)	24.04(0.73)	24.15(0.82)	24.18(0.83)
	8G		23.97(0.66)	24.06(0.67)	23.98(0.68)	24.07(0.73)	24.13(0.81)	24.16(0.83)
	16G		24.05(0.66)	24.03(0.68)	24.02(0.68)	24.08(0.72)	24.18(0.82)	24.16(0.83)
	
		cyc-buf	128		256		512		1024		2048		4096	
	splblk-size
	2M		24.05(0.65)	24.04(0.65)	24.58(0.65)	24.15(0.64)	24.14(0.64)	49.05(25.46)
	4M		23.98(0.65)	24.01(0.65)	24.29(0.65)	24.20(0.64)	24.18(0.66)	49.04(25.46)
	8M		24.02(0.65)	24.03(0.65)	24.25(0.65)	24.26(0.70)	24.15(0.64)	48.98(25.44)
	16M		24.01(0.65)	24.01(0.65)	24.30(0.65)	24.19(0.64)	24.12(0.65)	48.99(25.45)
	32M		23.97(0.65)	24.06(0.73)	24.23(0.65)	24.17(0.64)	24.19(0.64)	48.97(25.50)
	64M		24.06(0.66)	24.07(0.66)	24.27(0.66)	24.16(0.65)	24.17(0.65)	48.98(25.49)
	128M		24.03(0.67)	24.00(0.67)	24.27(0.66)	24.22(0.66)	24.19(0.66)	48.98(25.48)
	256M		24.12(0.67)	23.99(0.67)	24.27(0.67)	24.17(0.66)	24.12(0.66)	49.04(25.49)
	512M		24.06(0.70)	24.08(0.70)	24.26(0.70)	24.14(0.71)	24.19(0.70)	49.13(25.64)
	1G		24.20(0.82)	24.13(0.81)	24.36(0.81)	24.31(0.80)	24.33(0.81)	49.28(25.75)
	2G		24.19(0.81)	24.22(0.81)	24.37(0.81)	24.29(0.80)	24.28(0.82)	49.30(25.78)
	4G		25.29(1.90)	25.26(1.91)	25.49(1.91)	25.41(1.89)	25.50(1.90)	49.99(26.45)
	8G		25.33(1.90)	26.60(3.23)	26.87(3.21)	26.71(3.23)	26.64(3.22)	51.27(27.73)
	16G		25.28(1.90)	26.52(3.21)	29.47(5.86)	29.34(5.84)	29.38(5.86)	53.99(30.40)


	
Thanks
Zhou Wenjian


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time
  2014-10-22  1:52   ` "Zhou, Wenjian/周文剑"
@ 2014-10-27  1:08     ` "Zhou, Wenjian/周文剑"
  2014-10-27  7:51     ` Atsushi Kumagai
  1 sibling, 0 replies; 21+ messages in thread
From: "Zhou, Wenjian/周文剑" @ 2014-10-27  1:08 UTC (permalink / raw)
  To: kexec; +Cc: Atsushi Kumagai

ping...

On 10/22/2014 09:52 AM, "Zhou, Wenjian/周文剑" wrote:
> On 10/17/2014 11:50 AM, Atsushi Kumagai wrote:
>> Hello,
>>
>> The code looks good to me, thanks Zhou.
>> Now, I have a question on performance.
>>
>> Do you have any idea why the time of the second pass is much larger when
>> the splitblock-size is 2G? I'm worried about the scalability.
>>
> Hello,
>
> [test setup and results trimmed; unchanged from my previous mail]
>
> Thanks
> Zhou Wenjian
>


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time
  2014-10-17  3:50 ` [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time Atsushi Kumagai
  2014-10-22  1:52   ` "Zhou, Wenjian/周文剑"
@ 2014-10-27  6:19   ` "Zhou, Wenjian/周文剑"
  1 sibling, 0 replies; 21+ messages in thread
From: "Zhou, Wenjian/周文剑" @ 2014-10-27  6:19 UTC (permalink / raw)
  To: Atsushi Kumagai; +Cc: kexec@lists.infradead.org

On 10/17/2014 11:50 AM, Atsushi Kumagai wrote:
> Do you have any idea why the time of second pass is much larger when
> the splitblock-size is 2G ? I worry about the scalability.

I think I have made some mistakes during the test.

Thanks
Zhou Wenjian


^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time
  2014-10-22  1:52   ` "Zhou, Wenjian/周文剑"
  2014-10-27  1:08     ` "Zhou, Wenjian/周文剑"
@ 2014-10-27  7:51     ` Atsushi Kumagai
  2014-10-28  6:24       ` HATAYAMA Daisuke
  1 sibling, 1 reply; 21+ messages in thread
From: Atsushi Kumagai @ 2014-10-27  7:51 UTC (permalink / raw)
  To: zhouwj-fnst@cn.fujitsu.com; +Cc: kexec@lists.infradead.org

Hello Zhou,

>On 10/17/2014 11:50 AM, Atsushi Kumagai wrote:
>> Hello,
>>
>> The code looks good to me, thanks Zhou.
>> Now, I have a question on performance.
>>
>>> The issue is discussed at http://lists.infradead.org/pipermail/kexec/2014-March/011289.html
>>>
>>> This patch implements the idea of a 2-pass algorithm that uses a small amount of memory to manage the splitblock table.
>>> Strictly speaking, the algorithm is still 3-pass, but the time of the second pass is much shorter.
>>> The tables below show the performance with different sizes of cyclic-buffer and splitblock.
>>> The test is executed on a machine with 128G of memory.
>>>
>>> The value is the total time (including the first and second passes).
>>> The value in brackets is the time of the second pass.
>>
>> Do you have any idea why the time of the second pass is much larger when
>> the splitblock-size is 2G? I'm worried about the scalability.
>>
>Hello,
>
>	Since the previous machine is no longer available for some reasons, I ran the tests several times with the latest
>code on other machines, but the issue never reproduced. Everything seems to be right. The tests were executed on two machines (a server and a PC).
>Tests are based on:

Well...OK, I'll take that as an issue specific to that machine
(or your mistakes as you said).
Now I have another question.

calculate_end_pfn_by_splitblock():
	...
        /* deal with incomplete splitblock */
        if (pfn_needed_by_per_dumpfile < 0) {
                --*current_splitblock;
                splitblock_inner -= splitblock->entry_size;
                end_pfn = CURRENT_SPLITBLOCK_PFN_NUM;
                *current_splitblock_pfns = (-1) * pfn_needed_by_per_dumpfile;
                pfn_needed_by_per_dumpfile += read_value_from_splitblock_table(splitblock_inner);
                end_pfn = calculate_end_pfn_in_cycle(CURRENT_SPLITBLOCK_PFN_NUM,
                                                     CURRENT_SPLITBLOCK_PFN_NUM + splitblock->page_per_splitblock,
                                                     end_pfn,pfn_needed_by_per_dumpfile);
        }

This block causes re-scanning of the cycle corresponding to the
current splitblock, so the larger cyc-buf is, the longer this takes.
If cyc-buf is 4096 (meaning there is only one cycle), the whole page
scan will be done again in the second pass. Actually, the performance
when cyc-buf=4096 was quite bad.

Is this process necessary? I think splitting splitblocks is overkill:
as I understand it, splblk-size is the granularity of I/O fairness, so
tuning splblk-size is a trade-off between fairness and memory usage.
However, there is no advantage to reducing splblk-size in the current
implementation; it just consumes a large amount of memory. If we remove
this process, we can avoid the whole page scan in the second pass, and
reducing splblk-size will become meaningful, as I expected.


Thanks
Atsushi Kumagai


>		test1:
>			machine:	server
>			crashkernel:	512M
>			vmcore:		/proc/vmcore (128G)
>
>
>		test2:
>			machine:	pc
>			crashkernel:	256M
>			vmcore:		vmcore dumped from the server
>
>		test3:
>			machine:	pc
>			crashkernel:	128M
>			vmcore:		vmcore dumped from the server
>
>
>
>test1:
>													      sec
>		cyc-buf	2		4		8		16		32		64
>	splblk-size
>	2M		1.53(0.00)	1.24(0.00)	1.08(0.01)	0.99(0.01)	0.95(0.01)	0.95(0.04)
>	4M		1.52(0.00)	1.24(0.00)	1.08(0.01)	0.99(0.01)	0.95(0.01)	0.96(0.04)
>	8M		1.53(0.00)	1.23(0.00)	1.08(0.01)	0.99(0.01)	0.95(0.01)	0.95(0.04)
>	16M		1.53(0.00)	1.24(0.00)	1.08(0.01)	1.00(0.01)	0.94(0.01)	0.96(0.04)
>	32M		1.52(0.00)	1.25(0.00)	1.08(0.01)	0.99(0.01)	0.94(0.01)	0.96(0.05)
>	64M		1.52(0.00)	1.23(0.00)	1.07(0.01)	0.99(0.01)	0.97(0.01)	0.96(0.05)
>	128M		1.54(0.01)	1.25(0.01)	1.08(0.01)	0.99(0.01)	0.96(0.02)	0.96(0.05)
>	256M		1.53(0.01)	1.25(0.01)	1.08(0.01)	0.99(0.01)	0.95(0.02)	0.96(0.05)
>	512M		1.53(0.01)	1.25(0.01)	1.07(0.01)	1.01(0.02)	0.95(0.02)	0.96(0.05)
>	1G		1.54(0.02)	1.25(0.02)	1.08(0.02)	1.00(0.02)	0.96(0.03)	0.97(0.06)
>	2G		1.59(0.07)	1.30(0.06)	1.11(0.05)	1.04(0.06)	1.00(0.07)	0.98(0.07)
>	4G		1.60(0.08)	1.31(0.07)	1.14(0.07)	1.05(0.07)	1.02(0.08)	1.00(0.08)
>	8G		1.60(0.08)	1.31(0.07)	1.14(0.07)	1.05(0.07)	1.02(0.08)	0.99(0.08)
>	16G		1.60(0.08)	1.30(0.07)	1.14(0.07)	1.05(0.07)	1.02(0.08)	0.99(0.08)
>
>		cyc-buf	128		256		512		1024		2048		4096
>	splblk-size
>	2M		0.95(0.04)	0.94(0.04)	0.94(0.04)	0.95(0.05)	0.94(0.05)	1.68(0.78)
>	4M		0.95(0.04)	0.94(0.04)	0.94(0.04)	0.95(0.05)	0.94(0.05)	1.67(0.78)
>	8M		0.94(0.04)	0.94(0.04)	0.94(0.04)	0.95(0.05)	0.95(0.05)	1.68(0.78)
>	16M		0.96(0.05)	0.94(0.04)	0.95(0.05)	0.94(0.04)	0.94(0.05)	1.68(0.78)
>	32M		0.95(0.05)	0.94(0.05)	0.94(0.05)	0.94(0.05)	0.94(0.05)	1.68(0.78)
>	64M		0.96(0.05)	0.95(0.05)	0.94(0.05)	0.94(0.05)	0.95(0.05)	1.67(0.78)
>	128M		0.96(0.05)	0.95(0.05)	0.95(0.05)	0.95(0.05)	0.94(0.05)	1.67(0.78)
>	256M		0.95(0.05)	0.95(0.05)	0.95(0.05)	0.94(0.05)	0.94(0.05)	1.67(0.78)
>	512M		0.96(0.05)	0.95(0.05)	0.94(0.05)	0.94(0.05)	0.95(0.05)	1.68(0.79)
>	1G		0.96(0.06)	0.96(0.06)	0.97(0.06)	0.96(0.06)	0.96(0.07)	1.69(0.80)
>	2G		0.97(0.07)	0.97(0.07)	0.97(0.07)	0.96(0.07)	0.96(0.07)	1.70(0.80)
>	4G		1.02(0.10)	1.00(0.10)	0.99(0.10)	1.00(0.10)	0.99(0.10)	1.74(0.84)
>	8G		1.00(0.10)	1.06(0.16)	1.05(0.16)	1.05(0.16)	1.06(0.16)	1.78(0.89)
>	16G		1.00(0.10)	1.06(0.16)	1.16(0.26)	1.16(0.26)	1.15(0.26)	1.90(1.00)
>
>
>
>
>test2:
>														sec
>		cyc-buf	2		4		8		16		32		64
>	splblk-size
>	2M		23.42(0.03)	23.38(0.05)	23.46(0.11)	23.56(0.17)	23.56(0.23)	23.84(0.48)
>	4M		23.35(0.03)	23.35(0.05)	23.56(0.11)	23.51(0.17)	23.60(0.22)	23.81(0.48)
>	8M		23.34(0.03)	23.38(0.05)	23.46(0.11)	23.55(0.16)	23.58(0.22)	23.84(0.48)
>	16M		23.39(0.03)	23.36(0.06)	23.42(0.11)	23.50(0.16)	23.59(0.23)	23.86(0.48)
>	32M		23.43(0.03)	23.38(0.06)	23.47(0.12)	23.54(0.19)	23.58(0.23)	23.89(0.48)
>	64M		23.42(0.04)	23.43(0.07)	23.47(0.12)	23.53(0.18)	23.59(0.23)	23.87(0.48)
>	128M		23.45(0.07)	23.41(0.09)	23.52(0.14)	23.56(0.19)	23.59(0.23)	23.81(0.48)
>	256M		23.47(0.12)	23.48(0.14)	23.50(0.14)	23.55(0.20)	23.62(0.23)	23.84(0.48)
>	512M		23.48(0.18)	23.56(0.19)	23.55(0.19)	23.60(0.24)	23.71(0.32)	23.88(0.49)
>	1G		23.74(0.30)	23.53(0.23)	23.64(0.31)	23.54(0.24)	23.65(0.27)	23.98(0.52)
>	2G		23.78(0.48)	23.82(0.48)	23.84(0.48)	23.83(0.49)	23.89(0.52)	23.84(0.52)
>	4G		23.91(0.52)	23.81(0.50)	23.92(0.50)	23.87(0.51)	23.88(0.54)	23.92(0.54)
>	8G		23.80(0.50)	23.90(0.52)	23.82(0.49)	23.84(0.51)	23.90(0.54)	23.90(0.54)
>	16G		23.85(0.51)	23.86(0.50)	23.85(0.51)	23.87(0.51)	23.95(0.54)	23.88(0.54)
>
>		cyc-buf	128		256		512		1024		2048		4096
>	splblk-size
>	2M		23.90(0.47)	23.84(0.48)	23.84(0.48)	24.06(0.47)	23.93(0.47)	49.00(25.46)
>	4M		23.87(0.48)	23.84(0.48)	23.91(0.48)	23.93(0.47)	23.98(0.47)	49.00(25.46)
>	8M		23.80(0.48)	23.81(0.48)	23.95(0.48)	23.95(0.47)	24.03(0.52)	49.05(25.45)
>	16M		23.86(0.47)	23.84(0.48)	23.90(0.48)	23.93(0.47)	24.00(0.47)	48.95(25.41)
>	32M		23.86(0.48)	23.82(0.48)	23.98(0.54)	23.99(0.47)	23.97(0.47)	49.03(25.47)
>	64M		23.83(0.48)	23.79(0.48)	23.89(0.48)	24.02(0.47)	23.93(0.47)	48.96(25.49)
>	128M		23.92(0.55)	23.91(0.48)	23.85(0.48)	23.98(0.47)	23.90(0.47)	48.98(25.50)
>	256M		23.86(0.48)	23.88(0.48)	24.00(0.48)	24.17(0.47)	23.94(0.47)	49.01(25.50)
>	512M		23.85(0.49)	23.89(0.55)	23.93(0.49)	23.91(0.48)	24.01(0.48)	49.12(25.54)
>	1G		23.85(0.52)	23.84(0.52)	23.98(0.52)	23.98(0.51)	24.02(0.51)	49.28(25.73)
>	2G		23.92(0.52)	23.87(0.52)	23.93(0.52)	24.11(0.52)	24.04(0.52)	49.27(25.77)
>	4G		24.24(0.91)	24.31(0.91)	24.29(0.91)	24.37(0.90)	24.37(0.90)	50.01(26.49)
>	8G		24.27(0.91)	24.94(1.61)	25.07(1.62)	25.11(1.60)	25.02(1.60)	51.24(27.77)
>	16G		24.29(0.91)	24.98(1.63)	29.32(5.85)	29.34(5.83)	29.33(5.83)	53.91(30.43)
>
>
>test3:
>														sec
>		cyc-buf	2		4		8		16		32		64
>	splblk-size
>	2M		23.34(0.03)	23.38(0.05)	23.46(0.11)	23.47(0.16)	23.55(0.22)	23.97(0.65)
>	4M		23.38(0.03)	23.41(0.05)	23.46(0.11)	23.47(0.16)	23.56(0.22)	23.99(0.65)
>	8M		23.33(0.03)	23.41(0.06)	23.50(0.11)	23.52(0.16)	23.54(0.23)	24.05(0.66)
>	16M		23.38(0.03)	23.51(0.14)	23.46(0.12)	23.54(0.16)	23.57(0.23)	23.98(0.65)
>	32M		23.34(0.03)	23.37(0.06)	23.47(0.12)	23.53(0.18)	23.56(0.23)	24.06(0.66)
>	64M		23.36(0.06)	23.43(0.08)	23.52(0.13)	23.53(0.18)	23.63(0.23)	23.99(0.66)
>	128M		23.47(0.07)	23.40(0.09)	23.54(0.15)	23.50(0.19)	23.57(0.23)	24.15(0.74)
>	256M		23.48(0.13)	23.52(0.14)	23.50(0.15)	23.54(0.19)	23.56(0.23)	23.98(0.66)
>	512M		23.49(0.18)	23.49(0.19)	23.60(0.20)	23.66(0.32)	23.64(0.23)	24.15(0.75)
>	1G		23.58(0.23)	23.54(0.23)	23.62(0.24)	23.57(0.24)	23.64(0.27)	24.17(0.81)
>	2G		24.00(0.65)	23.99(0.66)	24.01(0.67)	24.01(0.70)	24.13(0.80)	24.43(0.82)
>	4G		24.04(0.66)	23.97(0.67)	24.05(0.71)	24.04(0.73)	24.15(0.82)	24.18(0.83)
>	8G		23.97(0.66)	24.06(0.67)	23.98(0.68)	24.07(0.73)	24.13(0.81)	24.16(0.83)
>	16G		24.05(0.66)	24.03(0.68)	24.02(0.68)	24.08(0.72)	24.18(0.82)	24.16(0.83)
>
>		cyc-buf	128		256		512		1024		2048		4096
>	splblk-size
>	2M		24.05(0.65)	24.04(0.65)	24.58(0.65)	24.15(0.64)	24.14(0.64)	49.05(25.46)
>	4M		23.98(0.65)	24.01(0.65)	24.29(0.65)	24.20(0.64)	24.18(0.66)	49.04(25.46)
>	8M		24.02(0.65)	24.03(0.65)	24.25(0.65)	24.26(0.70)	24.15(0.64)	48.98(25.44)
>	16M		24.01(0.65)	24.01(0.65)	24.30(0.65)	24.19(0.64)	24.12(0.65)	48.99(25.45)
>	32M		23.97(0.65)	24.06(0.73)	24.23(0.65)	24.17(0.64)	24.19(0.64)	48.97(25.50)
>	64M		24.06(0.66)	24.07(0.66)	24.27(0.66)	24.16(0.65)	24.17(0.65)	48.98(25.49)
>	128M		24.03(0.67)	24.00(0.67)	24.27(0.66)	24.22(0.66)	24.19(0.66)	48.98(25.48)
>	256M		24.12(0.67)	23.99(0.67)	24.27(0.67)	24.17(0.66)	24.12(0.66)	49.04(25.49)
>	512M		24.06(0.70)	24.08(0.70)	24.26(0.70)	24.14(0.71)	24.19(0.70)	49.13(25.64)
>	1G		24.20(0.82)	24.13(0.81)	24.36(0.81)	24.31(0.80)	24.33(0.81)	49.28(25.75)
>	2G		24.19(0.81)	24.22(0.81)	24.37(0.81)	24.29(0.80)	24.28(0.82)	49.30(25.78)
>	4G		25.29(1.90)	25.26(1.91)	25.49(1.91)	25.41(1.89)	25.50(1.90)	49.99(26.45)
>	8G		25.33(1.90)	26.60(3.23)	26.87(3.21)	26.71(3.23)	26.64(3.22)	51.27(27.73)
>	16G		25.28(1.90)	26.52(3.21)	29.47(5.86)	29.34(5.84)	29.38(5.86)	53.99(30.40)
>
>
>
>Thanks
>Zhou Wenjian

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time
  2014-10-27  7:51     ` Atsushi Kumagai
@ 2014-10-28  6:24       ` HATAYAMA Daisuke
  2014-10-28  6:32         ` qiaonuohan
  0 siblings, 1 reply; 21+ messages in thread
From: HATAYAMA Daisuke @ 2014-10-28  6:24 UTC (permalink / raw)
  To: kumagai-atsushi; +Cc: zhouwj-fnst, kexec

From: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
Subject: RE: [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time
Date: Mon, 27 Oct 2014 07:51:56 +0000

> Hello Zhou,
> 
>> [earlier discussion trimmed]
> 
> Well...OK, I'll take that as an issue specific to that machine
> (or your mistakes as you said).
> Now I have another question.
> 
> calculate_end_pfn_by_splitblock():
> 	...
>         /* deal with incomplete splitblock */
>         if (pfn_needed_by_per_dumpfile < 0) {
>                 --*current_splitblock;
>                 splitblock_inner -= splitblock->entry_size;
>                 end_pfn = CURRENT_SPLITBLOCK_PFN_NUM;
>                 *current_splitblock_pfns = (-1) * pfn_needed_by_per_dumpfile;
>                 pfn_needed_by_per_dumpfile += read_value_from_splitblock_table(splitblock_inner);
>                 end_pfn = calculate_end_pfn_in_cycle(CURRENT_SPLITBLOCK_PFN_NUM,
>                                                      CURRENT_SPLITBLOCK_PFN_NUM + splitblock->page_per_splitblock,
>                                                      end_pfn,pfn_needed_by_per_dumpfile);
>         }
> 
> This block causes re-scanning of the cycle corresponding to the
> current splitblock, so the larger cyc-buf is, the longer this takes.
> If cyc-buf is 4096 (meaning there is only one cycle), the whole page
> scan will be done again in the second pass. Actually, the performance
> when cyc-buf=4096 was quite bad.
> 
> Is this process necessary? I think splitting splitblocks is overkill:
> as I understand it, splblk-size is the granularity of I/O fairness, so
> tuning splblk-size is a trade-off between fairness and memory usage.
> However, there is no advantage to reducing splblk-size in the current
> implementation; it just consumes a large amount of memory. If we remove
> this process, we can avoid the whole page scan in the second pass, and
> reducing splblk-size will become meaningful, as I expected.
> 

Yes, I don't think this rescan fits the splitblock method either.
The idea of the splitblock method is to reduce the number of
filtering passes from 3 to 2, at the expense of at most a
splitblock-size difference between the dump files. Doing a rescan
here doesn't fit that idea.

--
Thanks.
HATAYAMA, Daisuke



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/5] Add support for splitblock
  2014-10-13  9:34 ` [PATCH v2 1/5] Add support for splitblock Zhou Wenjian
@ 2014-10-28  6:30   ` HATAYAMA Daisuke
  0 siblings, 0 replies; 21+ messages in thread
From: HATAYAMA Daisuke @ 2014-10-28  6:30 UTC (permalink / raw)
  To: zhouwj-fnst; +Cc: kexec

From: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
Subject: [PATCH v2 1/5] Add support for splitblock
Date: Mon, 13 Oct 2014 17:34:22 +0800

> When the --split option is specified, fair I/O workloads should be
> assigned to each process, so the start and end pfn of each dumpfile
> should be calculated with unnecessary pages excluded. However, running
> the exclusion over the whole memory costs a lot of time. That is why
> struct SplitBlock exists. Struct SplitBlock is designed to manage
> memory, mainly by recording the number of dumpable pages. We can use
> that count to calculate the start and end pfn instead of excluding
> pages over the whole memory.
> 
> The char array *table in struct SplitBlock is used to record the number
> of dumpable pages.
> The table entry size is calculated as
> 			divideup(log2(splitblock_size / page_size), 8) bytes
> so that the space the table takes will be small enough, while the code
> still performs well when the number of pages in one splitblock is large
> enough.
> 
> Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
> Signed-off-by: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
> ---
>  makedumpfile.c |   23 +++++++++++++++++++++++
>  makedumpfile.h |   14 ++++++++++++++
>  2 files changed, 37 insertions(+), 0 deletions(-)
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index b4d43d8..95d553c 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -34,6 +34,7 @@ struct srcfile_table	srcfile_table;
>  
>  struct vm_table		vt = { 0 };
>  struct DumpInfo		*info = NULL;
> +struct SplitBlock		*splitblock = NULL;
>  
>  char filename_stdout[] = FILENAME_STDOUT;
>  
> @@ -5685,6 +5686,28 @@ out:
>  	return ret;
>  }
>  
> +/*
> + * cyclic_split mode:
> + *	manage memory by splitblocks,
> + *	divide memory into splitblocks
> + *	use splitblock_table to record numbers of dumpable pages in each splitblock
> + */
> +
> +//calculate entry size based on the amount of pages in one splitblock

Please use /* */ comment style, just as in other parts of the
makedumpfile source code.

> +int
> +calculate_entry_size(void){

Please add a line break:

calculate_entry_size(void)
{

> +	int entry_num = 1, count = 1;
> +	int entry_size;

Please add a blank line.

> +	while (entry_num < splitblock->page_per_splitblock){
> +		entry_num = entry_num << 1;
> +		count++;
> +	}
> +	entry_size = count/BITPERBYTE;
> +	if (count %BITPERBYTE)
> +		entry_size++;
> +	return entry_size;
> +}
> +
>  mdf_pfn_t
>  get_num_dumpable(void)
>  {
> diff --git a/makedumpfile.h b/makedumpfile.h
> index 96830b0..98b8404 100644
> --- a/makedumpfile.h
> +++ b/makedumpfile.h
> @@ -1168,10 +1168,24 @@ struct DumpInfo {
>  	 */
>  	int (*page_is_buddy)(unsigned long flags, unsigned int _mapcount,
>  			     unsigned long private, unsigned int _count);
> +	/*
> +	 *for cyclic_splitting mode, setup splitblock_size
> +	 */
> +	long long splitblock_size;
>  };
>  extern struct DumpInfo		*info;
>  
>  /*
> + *for cyclic_splitting mode,Manage memory by splitblock
> + */
> +struct SplitBlock{
> +        char *table;
> +        long long num;
> +        long long page_per_splitblock;
> +        int entry_size;                 //counted by byte

Please use /* */.

> +};
> +
> +/*
>   * kernel VM-related data
>   */
>  struct vm_table {
> -- 
> 1.7.1
> 
--
Thanks.
HATAYAMA, Daisuke




* Re: [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time
  2014-10-28  6:24       ` HATAYAMA Daisuke
@ 2014-10-28  6:32         ` qiaonuohan
  2014-10-30  0:21           ` HATAYAMA Daisuke
  0 siblings, 1 reply; 21+ messages in thread
From: qiaonuohan @ 2014-10-28  6:32 UTC (permalink / raw)
  To: HATAYAMA Daisuke; +Cc: kexec, zhouwj-fnst, kumagai-atsushi

On 10/28/2014 02:24 PM, HATAYAMA Daisuke wrote:
> From: Atsushi Kumagai<kumagai-atsushi@mxc.nes.nec.co.jp>
> Subject: RE: [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time
> Date: Mon, 27 Oct 2014 07:51:56 +0000
>
>> Hello Zhou,
>>
>>> On 10/17/2014 11:50 AM, Atsushi Kumagai wrote:
>>>> Hello,
>>>>
>>>> The code looks good to me, thanks Zhou.
>>>> Now, I have a question on performance.
>>>>
>>>>> The issue is discussed at http://lists.infradead.org/pipermail/kexec/2014-March/011289.html
>>>>>
>>>>> This patch implements the idea of a 2-pass algorithm with a smaller memory footprint to manage the splitblock table.
>>>>> Strictly speaking, the algorithm is still 3-pass, but the time of the second pass is much shorter.
>>>>> The tables below show the performance with different sizes of cyclic-buffer and splitblock.
>>>>> The test was executed on a machine with 128G of memory.
>>>>>
>>>>> The value is the total time (including first and second passes).
>>>>> The value in brackets is the time of the second pass.
>>>>
>>>> Do you have any idea why the time of the second pass is much larger when
>>>> the splitblock-size is 2G? I worry about the scalability.
>>>>
>>> Hello,
>>>
>>> 	Since the previous machine can't be used for some reason, I tested several times using the latest code
>>> on other machines, but that never happened. Everything seems right. Tests were executed on two machines (a server and a PC).
>>> Tests are based on:
>>
>> Well...OK, I'll take that as an issue specific to that machine
>> (or your mistakes as you said).
>> Now I have another question.
>>
>> calculate_end_pfn_by_splitblock():
>> 	...
>>          /* deal with incomplete splitblock */
>>          if (pfn_needed_by_per_dumpfile<  0) {
>>                  --*current_splitblock;
>>                  splitblock_inner -= splitblock->entry_size;
>>                  end_pfn = CURRENT_SPLITBLOCK_PFN_NUM;
>>                  *current_splitblock_pfns = (-1) * pfn_needed_by_per_dumpfile;
>>                  pfn_needed_by_per_dumpfile += read_value_from_splitblock_table(splitblock_inner);
>>                  end_pfn = calculate_end_pfn_in_cycle(CURRENT_SPLITBLOCK_PFN_NUM,
>>                                                       CURRENT_SPLITBLOCK_PFN_NUM + splitblock->page_per_splitblock,
>>                                                       end_pfn,pfn_needed_by_per_dumpfile);
>>          }
>>
>> This block causes a re-scan of the cycle corresponding to the
>> current_splitblock, so the larger cyc-buf is, the longer it takes.
>> If cyc-buf is 4096 (which means the number of cycles is 1), the whole
>> page scan will be done in the second pass. In fact, the performance
>> when cyc-buf=4096 was quite bad.
>>
>> Is this process necessary? I think splitting splitblocks is overkill:
>> as I understand it, splblk-size is the granularity of I/O fairness, so
>> tuning splblk-size is a trade-off between fairness and memory usage.
>> However, there is no advantage to reducing splblk-size in the current
>> implementation; it just consumes large amounts of memory.
>> If we remove this process, we can avoid the whole page scan in the
>> second pass, and reducing splblk-size will become meaningful as I
>> expected.
>>
>
> Yes, I don't think this rescan works with this splitblock method
> either. The idea of the splitblock method is to reduce the number of
> filtering passes from three to two at the expense of at most a
> splitblock-size difference between the dump files. Doing a rescan here
> doesn't fit that idea.

Hello,

The only thing that bothers me is that, without getting the exact pfn,
some of the split files may be empty, with no pages stored in them. If
this is not an issue, I think the re-scanning is useless.

>
> --
> Thanks.
> HATAYAMA, Daisuke
>
>


-- 
Regards
Qiao Nuohan



* Re: [PATCH v2 2/5] Add tools for reading and writing from splitblock table
  2014-10-13  9:34 ` [PATCH v2 2/5] Add tools for reading and writing from splitblock table Zhou Wenjian
@ 2014-10-28  6:42   ` HATAYAMA Daisuke
  0 siblings, 0 replies; 21+ messages in thread
From: HATAYAMA Daisuke @ 2014-10-28  6:42 UTC (permalink / raw)
  To: zhouwj-fnst; +Cc: kexec

From: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
Subject: [PATCH v2 2/5] Add tools for reading and writing from splitblock table
Date: Mon, 13 Oct 2014 17:34:23 +0800

> The functions added in this patch are used for writing and reading values
> from the char array in struct SplitBlock.
> 
> Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
> Signed-off-by: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
> ---
>  makedumpfile.c |   23 +++++++++++++++++++++++
>  1 files changed, 23 insertions(+), 0 deletions(-)
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index 95d553c..a8d86f6 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -5708,6 +5708,29 @@ calculate_entry_size(void){
>  	return entry_size;
>  }
>  
> +void
> +write_value_into_splitblock_table(char *splitblock_inner, unsigned long long content)

This line exceeds 80 characters. The cause would be that the names of
the function arguments are too long.

How about this?

void
write_value_into_splitblock_table(char *entry, unsigned long long value)


> +{
> +	char temp;
> +	int i=0;

Please add a blank line.

Please add spaces around the assignment operator:

int i = 0;

> +	while (i++ < splitblock->entry_size) {
> +		temp = content & 0xff;
> +		content = content >> BITPERBYTE;
> +		*splitblock_inner++ = temp;

Please write the increment operator on a separate line:

                *splitblock_inner = temp;
                splitblock_inner++;

> +	}
> +}
> +unsigned long long
> +read_value_from_splitblock_table(char *splitblock_inner)
> +{
> +	unsigned long long ret = 0;

How about value instead of ret? In addition to the above comment, this
also keeps the naming consistent between the two helper functions.
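For reference, the two helpers form a little-endian serializer pair; a self-contained sketch (hypothetical standalone versions that take entry_size as a parameter instead of reading the global splitblock) round-trips a value through the byte table:

```c
#include <assert.h>

#define BITPERBYTE 8

/* Sketch of write_value_into_splitblock_table: store `value` into
 * `entry` as entry_size little-endian bytes. */
void write_entry(char *entry, int entry_size, unsigned long long value)
{
	int i;

	for (i = 0; i < entry_size; i++) {
		entry[i] = value & 0xff;
		value >>= BITPERBYTE;
	}
}

/* Sketch of read_value_from_splitblock_table: rebuild the value,
 * most significant byte first. */
unsigned long long read_entry(const char *entry, int entry_size)
{
	unsigned long long value = 0;
	int i;

	for (i = entry_size; i > 0; i--) {
		value <<= BITPERBYTE;
		value += entry[i - 1] & 0xff; /* mask off sign extension */
	}
	return value;
}
```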

> +	int i;

Please add a blank line.

> +	for (i = splitblock->entry_size; i > 0; i--) {
> +		ret = ret << BITPERBYTE;
> +		ret += *(splitblock_inner + i - 1) & 0xff;

Is the & 0xff necessary, given that splitblock_inner is of type char *?
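One note on that question: if char is signed on the platform, dropping the mask lets bytes with the top bit set sign-extend into the accumulated value. A tiny hypothetical demonstration:

```c
#include <assert.h>

/* Recover a byte's unsigned value (0..255) from a char. Without the
 * & 0xff, (char)0x80 converts to -128 where char is signed, which
 * would corrupt the accumulated sum in the reader. */
unsigned long long byte_value(char b)
{
	return b & 0xff;
}
```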

> +	}
> +	return ret;
> +}
> +
>  mdf_pfn_t
>  get_num_dumpable(void)
>  {
> -- 
> 1.7.1
> 
> 
--
Thanks.
HATAYAMA, Daisuke




* Re: [PATCH v2 3/5] Add module of generating table
  2014-10-13  9:34 ` [PATCH v2 3/5] Add module of generating table Zhou Wenjian
@ 2014-10-28  7:01   ` HATAYAMA Daisuke
  0 siblings, 0 replies; 21+ messages in thread
From: HATAYAMA Daisuke @ 2014-10-28  7:01 UTC (permalink / raw)
  To: zhouwj-fnst; +Cc: kexec

From: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
Subject: [PATCH v2 3/5] Add module of generating table
Date: Mon, 13 Oct 2014 17:34:24 +0800

> Set the splitblock size and generate the basic information of the splitblock table.
> 
> Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>

Please remove this Signed-off-by.

> Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
> Signed-off-by: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
> ---
>  makedumpfile.c |   95 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  makedumpfile.h |    2 +
>  2 files changed, 96 insertions(+), 1 deletions(-)
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index a8d86f6..a6f0be4 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -5208,7 +5208,13 @@ create_dump_bitmap(void)
>  	if (info->flag_cyclic) {
>  		if (!prepare_bitmap2_buffer_cyclic())
>  			goto out;
> -		info->num_dumpable = get_num_dumpable_cyclic();
> +		if (info->flag_split){
> +			if(!prepare_splitblock_table())
> +				goto out;
> +			info->num_dumpable = get_num_dumpable_cyclic_withsplit();
> +		}
> +		else

Please:

                } else {

> +			info->num_dumpable = get_num_dumpable_cyclic();

                }

>  
>  		if (!info->flag_elf_dumpfile)
>  			free_bitmap2_buffer_cyclic();
> @@ -5731,6 +5737,57 @@ read_value_from_splitblock_table(char *splitblock_inner)
>  	return ret;
>  }
>  
> +/*
> + * The splitblock size is specified as Kbyte with --splitblock-size <size> option.
> + * if not specified ,set default value

if not specified, set default value

> + */
> +int
> +check_splitblock_size(void)
> +{
> +	if (info->splitblock_size){
> +		info->splitblock_size <<= 10;
> +		if (info->splitblock_size == 0) {
> +			ERRMSG("The splitblock size could not be 0. %s.\n", strerror(errno));

This line exceeds 80 characters.

> +			return FALSE;
> +		}
> +		if (info->splitblock_size % info->page_size != 0) {
> +			ERRMSG("The splitblock size must be align to page_size. %s.\n",
> +									strerror(errno));
> +			return FALSE;
> +		}
> +	}
> +	else{

Please:

        } else {

> +		// set default 1GB
> +		info->splitblock_size = 1 << 30;


Please define a default value in makedumpfile.h explicitly and use it
here. Just like:

#define DEFAULT_SPLITBLOCK_SIZE (1LL << 30)

Then, the comment is unnecessary.

> +	}
> +	return TRUE;
> +}
> +
> +int
> +prepare_splitblock_table(void)
> +{
> +	if(!check_splitblock_size())
> +		return FALSE;
> +	if ((splitblock = calloc(1, sizeof(struct SplitBlock))) == NULL) {
> +		ERRMSG("Can't allocate memory for the splitblock. %s.\n", strerror(errno));
> +		return FALSE;
> +	}
> +	splitblock->page_per_splitblock = info->splitblock_size / info->page_size;
> +	/*
> +	 *divide memory into splitblocks.
> +	 *if there is a remainder, called it memory not managed by splitblock
> +	 *and it will be also dealt with in function calculate_end_pfn_by_splitblock()
> +	 */

Could you rewrite this comment? I don't understand it well.

> +	splitblock->num = info->max_mapnr/splitblock->page_per_splitblock;

Please add spaces:

splitblock->num = info->max_mapnr / splitblock->page_per_splitblock;

> +	splitblock->entry_size = calculate_entry_size();
> +	if ((splitblock->table = (char *)calloc(sizeof(char), (splitblock->entry_size * splitblock->num)))
> +										== NULL) {
> +		ERRMSG("Can't allocate memory for the splitblock_table. %s.\n", strerror(errno));
> +		return FALSE;
> +	}

Like this?

	size_t table_size;
...
	table_size = splitblock->entry_size * splitblock->num;
	splitblock->table = (char *)calloc(sizeof(char), table_size);
	if (!splitblock->table) {
		ERRMSG("Can't allocate memory for the splitblock_table. %s.\n",
	               strerror(errno));
		return FALSE;
	}

> +	return TRUE;
> +}
> +
>  mdf_pfn_t
>  get_num_dumpable(void)
>  {
> @@ -5746,6 +5803,36 @@ get_num_dumpable(void)
>  	return num_dumpable;
>  }
>  
> +/*
> + * generate splitblock_table
> + * modified from function get_num_dumpable_cyclic
> + */
> +mdf_pfn_t
> +get_num_dumpable_cyclic_withsplit(void)
> +{
> +	mdf_pfn_t pfn, num_dumpable = 0;
> +	mdf_pfn_t dumpable_pfn_num = 0, pfn_num = 0;
> +	struct cycle cycle = {0};
> +	int pos = 0;

Please add a blank line.

> +	for_each_cycle(0, info->max_mapnr, &cycle) {
> +		if (!exclude_unnecessary_pages_cyclic(&cycle))
> +			return FALSE;
> +		for (pfn = cycle.start_pfn; pfn < cycle.end_pfn; pfn++) {
> +			if (is_dumpable_cyclic(info->partial_bitmap2, pfn, &cycle)) {
> +				num_dumpable++;
> +				dumpable_pfn_num++;
> +			}
> +			if (++pfn_num >= splitblock->page_per_splitblock) {
> +				write_value_into_splitblock_table(splitblock->table + pos, dumpable_pfn_num);
> +				pos += splitblock->entry_size;
> +				pfn_num = 0;
> +				dumpable_pfn_num = 0;
> +			}
> +		}
> +	}
> +	return num_dumpable;
> +}
> +
>  mdf_pfn_t
>  get_num_dumpable_cyclic(void)
>  {
> @@ -9703,6 +9790,12 @@ out:
>  		if (info->page_buf != NULL)
>  			free(info->page_buf);
>  		free(info);
> +
> +		if (splitblock) {
> +			if (splitblock->table)
> +				free(splitblock->table);
> +		free(splitblock);

Please add an indent.

			free(splitblock);

> +		}
>  	}
>  	free_elf_info();
>  
> diff --git a/makedumpfile.h b/makedumpfile.h
> index 98b8404..60e6f2f 100644
> --- a/makedumpfile.h
> +++ b/makedumpfile.h
> @@ -1888,9 +1888,11 @@ struct elf_prstatus {
>   * Function Prototype.
>   */
>  mdf_pfn_t get_num_dumpable_cyclic(void);
> +mdf_pfn_t get_num_dumpable_cyclic_withsplit(void);
>  int get_loads_dumpfile_cyclic(void);
>  int initial_xen(void);
>  unsigned long long get_free_memory_size(void);
>  int calculate_cyclic_buffer_size(void);
> +int prepare_splitblock_table(void);
>  
>  #endif /* MAKEDUMPFILE_H */
> -- 
> 1.7.1
> 
> 
--
Thanks.
HATAYAMA, Daisuke




* Re: [PATCH v2 5/5] Add support for --splitblock-size
  2014-10-13  9:34 ` [PATCH v2 5/5] Add support for --splitblock-size Zhou Wenjian
@ 2014-10-28  7:15   ` HATAYAMA Daisuke
  2014-10-28  7:28     ` Atsushi Kumagai
  0 siblings, 1 reply; 21+ messages in thread
From: HATAYAMA Daisuke @ 2014-10-28  7:15 UTC (permalink / raw)
  To: zhouwj-fnst; +Cc: kexec

From: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
Subject: [PATCH v2 5/5] Add support for --splitblock-size
Date: Mon, 13 Oct 2014 17:34:26 +0800

> Use --splitblock-size to specify the splitblock size (KB).
> When --split is specified in cyclic mode, the splitblock table will be
> generated in create_dump_bitmap().
> 
> Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
> Signed-off-by: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
> ---
>  makedumpfile.8 |   16 ++++++++++++++++
>  makedumpfile.c |    4 ++++
>  makedumpfile.h |    1 +
>  print_info.c   |   16 +++++++++++++++-
>  4 files changed, 36 insertions(+), 1 deletions(-)
> 
> diff --git a/makedumpfile.8 b/makedumpfile.8
> index 9cb12c0..a5b7055 100644
> --- a/makedumpfile.8
> +++ b/makedumpfile.8
> @@ -386,6 +386,22 @@ size, so ordinary users don't need to specify this option.
>  # makedumpfile \-\-cyclic\-buffer 1024 \-d 31 \-x vmlinux /proc/vmcore dumpfile
>  
>  .TP
> +\fB\-\-splitblock\-size\fR \fIsplitblock_size\fR
> +Specify the splitblock size in kilo bytes for analysis in the cyclic mode with --split.
> +In the cyclic split mode, the number of splitblocks is represented as:
> +
> +    num_of_splitblocks = system_memory / (\fIsplitblock_size\fR * 1KB )
> +

For what audience do you want to show this expression? It would be hard
for ordinary users to understand it. I think at least the following
explanation is necessary; more detail than that is verbose.

  If --splitblock-size N is specified, the difference in size between
  the split dumpfiles is at most N kilobytes.
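For context, the formula in question is plain arithmetic; a hypothetical standalone sketch (mirroring the man-page expression and the entry-size rule from patch 1/5) shows what a given splitblock size costs in table memory:

```c
#include <assert.h>

#define BITPERBYTE 8

/* Number of splitblocks covering `memory` bytes (the man-page formula,
 * with both sizes in bytes). */
long long num_splitblocks(long long memory, long long splitblock_size)
{
	return memory / splitblock_size;
}

/* Entry size in bytes: enough bits to count the pages in one
 * splitblock, rounded up to whole bytes (cf. calculate_entry_size). */
int entry_size_bytes(long long splitblock_size, long long page_size)
{
	long long pages = splitblock_size / page_size;
	long long entry_num = 1;
	int bits = 1;

	while (entry_num < pages) {
		entry_num <<= 1;
		bits++;
	}
	return (bits + BITPERBYTE - 1) / BITPERBYTE;
}
```

For a 128G machine with 4K pages, a 1G splitblock gives 128 entries of 3 bytes each, while a 1M splitblock gives 131072 entries of 2 bytes each (256KB): the table stays small even at fine granularity, which is why reducing splblk-size should be cheap once the rescan is removed.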

> +The larger number of splitblock, the faster working speed is expected, but the more memory will
> +be taken. By default, \fIsplitblock_size\fR will be set as 1GB, so ordinary users don't need to
> +specify this option.
> +
> +.br
> +.B Example:
> +.br
> +# makedumpfile \-\-splitblock\-size 10240 \-d 31 \-x vmlinux \-\-split /proc/vmcore dumpfile1 dumpfile2
> +
> +.TP
>  \fB\-\-non\-cyclic\fR
>  Running in the non-cyclic mode, this mode uses the old filtering logic same as v1.4.4 or before.
>  If you feel the cyclic mode is too slow, please try this mode.
> diff --git a/makedumpfile.c b/makedumpfile.c
> index 32c0919..112f2e4 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -9578,6 +9578,7 @@ static struct option longopts[] = {
>  	{"eppic", required_argument, NULL, OPT_EPPIC},
>  	{"non-mmap", no_argument, NULL, OPT_NON_MMAP},
>  	{"mem-usage", no_argument, NULL, OPT_MEM_USAGE},
> +	{"splitblock-size", required_argument, NULL, OPT_SPLITBLOCK_SIZE},
>  	{0, 0, 0, 0}
>  };
>  
> @@ -9718,6 +9719,9 @@ main(int argc, char *argv[])
>  		case OPT_CYCLIC_BUFFER:
>  			info->bufsize_cyclic = atoi(optarg);
>  			break;
> +		case OPT_SPLITBLOCK_SIZE:
> +			info->splitblock_size = atoi(optarg);
> +			break;
>  		case '?':
>  			MSG("Commandline parameter is invalid.\n");
>  			MSG("Try `makedumpfile --help' for more information.\n");
> diff --git a/makedumpfile.h b/makedumpfile.h
> index 60e6f2f..7bc57d9 100644
> --- a/makedumpfile.h
> +++ b/makedumpfile.h
> @@ -1883,6 +1883,7 @@ struct elf_prstatus {
>  #define OPT_EPPIC               OPT_START+12
>  #define OPT_NON_MMAP            OPT_START+13
>  #define OPT_MEM_USAGE            OPT_START+14
> +#define OPT_SPLITBLOCK_SIZE		OPT_START+15
>  
>  /*
>   * Function Prototype.
> diff --git a/print_info.c b/print_info.c
> index f6342d3..2cdffd8 100644
> --- a/print_info.c
> +++ b/print_info.c
> @@ -203,7 +203,21 @@ print_usage(void)
>  	MSG("      By default, BUFFER_SIZE will be calculated automatically depending on\n");
>  	MSG("      system memory size, so ordinary users don't need to specify this option.\n");
>  	MSG("\n");
> -	MSG("  [--non-cyclic]:\n");
> +	MSG("  [--splitblock-size SPLITBLOCK_SIZE]:\n");
> +	MSG("      Specify the splitblock size in kilo bytes for analysis in the cyclic mode\n");
> +	MSG("      with --split.\n");
> +	MSG("      In the cyclic mode, the number of splitblocks is represented as:\n");
> +	MSG("\n");
> +	MSG("          num_of_splitblocks = system_memory / (splitblock_size * 1KB)\n");

Just the same as the above comment.

> +	MSG("\n");
> +	MSG("	   The larger number of splitblock, the faster working speed is expected, but\n");
> +	MSG("	   the more memory will be taken. By default, splitblock_size will be set as\n");
> +	MSG("	   1GB, so ordinary users don't need to specify this option.\n");
> +	MSG("\n");
> +	MSG("      The lesser number of cycles, the faster working speed is expected.\n");
> +	MSG("      By default, BUFFER_SIZE will be calculated automatically depending on\n");
> +	MSG("      system memory size, so ordinary users don't need to specify this option.\n");
> +	MSG("\n");	MSG("  [--non-cyclic]:\n");
>  	MSG("      Running in the non-cyclic mode, this mode uses the old filtering logic\n");
>  	MSG("      same as v1.4.4 or before.\n");
>  	MSG("      If you feel the cyclic mode is too slow, please try this mode.\n");
> -- 
> 1.7.1
> 
> 
--
Thanks.
HATAYAMA, Daisuke




* RE: [PATCH v2 5/5] Add support for --splitblock-size
  2014-10-28  7:15   ` HATAYAMA Daisuke
@ 2014-10-28  7:28     ` Atsushi Kumagai
  0 siblings, 0 replies; 21+ messages in thread
From: Atsushi Kumagai @ 2014-10-28  7:28 UTC (permalink / raw)
  To: zhouwj-fnst@cn.fujitsu.com
  Cc: d.hatayama@jp.fujitsu.com, kexec@lists.infradead.org

>> Use --splitblock-size to specify the splitblock size (KB).
>> When --split is specified in cyclic mode, the splitblock table will be
>> generated in create_dump_bitmap().
>>
>> Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
>> Signed-off-by: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
>> ---
>>  makedumpfile.8 |   16 ++++++++++++++++
>>  makedumpfile.c |    4 ++++
>>  makedumpfile.h |    1 +
>>  print_info.c   |   16 +++++++++++++++-
>>  4 files changed, 36 insertions(+), 1 deletions(-)
>>
>> diff --git a/makedumpfile.8 b/makedumpfile.8
>> index 9cb12c0..a5b7055 100644
>> --- a/makedumpfile.8
>> +++ b/makedumpfile.8
>> @@ -386,6 +386,22 @@ size, so ordinary users don't need to specify this option.
>>  # makedumpfile \-\-cyclic\-buffer 1024 \-d 31 \-x vmlinux /proc/vmcore dumpfile
>>
>>  .TP
>> +\fB\-\-splitblock\-size\fR \fIsplitblock_size\fR
>> +Specify the splitblock size in kilo bytes for analysis in the cyclic mode with --split.
>> +In the cyclic split mode, the number of splitblocks is represented as:
>> +
>> +    num_of_splitblocks = system_memory / (\fIsplitblock_size\fR * 1KB )
>> +
>
>For what audience do you want to show this expression? It would be hard
>for ordinary users to understand it. I think at least the following
>explanation is necessary; more detail than that is verbose.
>
>  If --splitblock-size N is specified, the difference in size between
>  the split dumpfiles is at most N kilobytes.
>  size is at most N kilo bytes.
>
>> +The larger number of splitblock, the faster working speed is expected, but the more memory will
>> +be taken. By default, \fIsplitblock_size\fR will be set as 1GB, so ordinary users don't need to
>> +specify this option.
>> +
>> +.br
>> +.B Example:
>> +.br
>> +# makedumpfile \-\-splitblock\-size 10240 \-d 31 \-x vmlinux \-\-split /proc/vmcore dumpfile1 dumpfile2
>> +
>> +.TP
>>  \fB\-\-non\-cyclic\fR
>>  Running in the non-cyclic mode, this mode uses the old filtering logic same as v1.4.4 or before.
>>  If you feel the cyclic mode is too slow, please try this mode.
>> diff --git a/makedumpfile.c b/makedumpfile.c
>> index 32c0919..112f2e4 100644
>> --- a/makedumpfile.c
>> +++ b/makedumpfile.c
>> @@ -9578,6 +9578,7 @@ static struct option longopts[] = {
>>  	{"eppic", required_argument, NULL, OPT_EPPIC},
>>  	{"non-mmap", no_argument, NULL, OPT_NON_MMAP},
>>  	{"mem-usage", no_argument, NULL, OPT_MEM_USAGE},
>> +	{"splitblock-size", required_argument, NULL, OPT_SPLITBLOCK_SIZE},
>>  	{0, 0, 0, 0}
>>  };
>>
>> @@ -9718,6 +9719,9 @@ main(int argc, char *argv[])
>>  		case OPT_CYCLIC_BUFFER:
>>  			info->bufsize_cyclic = atoi(optarg);
>>  			break;
>> +		case OPT_SPLITBLOCK_SIZE:
>> +			info->splitblock_size = atoi(optarg);
>> +			break;
>>  		case '?':
>>  			MSG("Commandline parameter is invalid.\n");
>>  			MSG("Try `makedumpfile --help' for more information.\n");
>> diff --git a/makedumpfile.h b/makedumpfile.h
>> index 60e6f2f..7bc57d9 100644
>> --- a/makedumpfile.h
>> +++ b/makedumpfile.h
>> @@ -1883,6 +1883,7 @@ struct elf_prstatus {
>>  #define OPT_EPPIC               OPT_START+12
>>  #define OPT_NON_MMAP            OPT_START+13
>>  #define OPT_MEM_USAGE            OPT_START+14
>> +#define OPT_SPLITBLOCK_SIZE		OPT_START+15
>>
>>  /*
>>   * Function Prototype.
>> diff --git a/print_info.c b/print_info.c
>> index f6342d3..2cdffd8 100644
>> --- a/print_info.c
>> +++ b/print_info.c
>> @@ -203,7 +203,21 @@ print_usage(void)
>>  	MSG("      By default, BUFFER_SIZE will be calculated automatically depending on\n");
>>  	MSG("      system memory size, so ordinary users don't need to specify this option.\n");
>>  	MSG("\n");
>> -	MSG("  [--non-cyclic]:\n");
>> +	MSG("  [--splitblock-size SPLITBLOCK_SIZE]:\n");
>> +	MSG("      Specify the splitblock size in kilo bytes for analysis in the cyclic mode\n");
>> +	MSG("      with --split.\n");
>> +	MSG("      In the cyclic mode, the number of splitblocks is represented as:\n");
>> +	MSG("\n");
>> +	MSG("          num_of_splitblocks = system_memory / (splitblock_size * 1KB)\n");
>
>Just the same as the above comment.
>
>> +	MSG("\n");
>> +	MSG("	   The larger number of splitblock, the faster working speed is expected, but\n");
>> +	MSG("	   the more memory will be taken. By default, splitblock_size will be set as\n");
>> +	MSG("	   1GB, so ordinary users don't need to specify this option.\n");
>> +	MSG("\n");
>> +	MSG("      The lesser number of cycles, the faster working speed is expected.\n");
>> +	MSG("      By default, BUFFER_SIZE will be calculated automatically depending on\n");
>> +	MSG("      system memory size, so ordinary users don't need to specify this option.\n");

In addition, the last segment is a duplicate of --cyclic-buffer's.


Thanks,
Atsushi Kumagai

>> +	MSG("\n");	MSG("  [--non-cyclic]:\n");
>>  	MSG("      Running in the non-cyclic mode, this mode uses the old filtering logic\n");
>>  	MSG("      same as v1.4.4 or before.\n");
>>  	MSG("      If you feel the cyclic mode is too slow, please try this mode.\n");
>> --
>> 1.7.1
>>
>>
>--
>Thanks.
>HATAYAMA, Daisuke
>
>



* Re: [PATCH v2 4/5] Add module of calculating start_pfn and end_pfn in each dumpfile
  2014-10-13  9:34 ` [PATCH v2 4/5] Add module of calculating start_pfn and end_pfn in each dumpfile Zhou Wenjian
@ 2014-10-28  7:43   ` HATAYAMA Daisuke
  0 siblings, 0 replies; 21+ messages in thread
From: HATAYAMA Daisuke @ 2014-10-28  7:43 UTC (permalink / raw)
  To: zhouwj-fnst; +Cc: kexec

From: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
Subject: [PATCH v2 4/5] Add module of calculating start_pfn and end_pfn in each dumpfile
Date: Mon, 13 Oct 2014 17:34:25 +0800

> When --split is specified in cyclic mode, the start_pfn and end_pfn of each
> dumpfile will be calculated so that every dumpfile has the same size.
> 
> Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>

Please remove this Signed-off-by.

> Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
> Signed-off-by: Zhou Wenjian <zhouwj-fnst@cn.fujitsu.com>
> ---
>  makedumpfile.c |  109 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
>  1 files changed, 104 insertions(+), 5 deletions(-)
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index a6f0be4..32c0919 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -8190,6 +8190,103 @@ out:
>  		return ret;
>  }
>  
> +/*
> + * calculate end pfn in incomplete splitblock or memory not managed by splitblock
> + */
> +mdf_pfn_t
> +calculate_end_pfn_in_cycle(mdf_pfn_t start, mdf_pfn_t max,
> +			    mdf_pfn_t end_pfn, long long pfn_needed_by_per_dumpfile)
> +{
> +	struct cycle cycle;

Please add a blank line.

> +	for_each_cycle(start,max,&cycle) {
> +		if (!exclude_unnecessary_pages_cyclic(&cycle))
> +			return FALSE;
> +		while (end_pfn < cycle.end_pfn) {
> +			end_pfn++;
> +			if (is_dumpable_cyclic(info->partial_bitmap2, end_pfn, &cycle)){

This line exceeds 80 columns. It would be enough to add a line break at
each argument position:

			if (is_dumpable_cyclic(info->partial_bitmap2,
                                               end_pfn,
                                               &cycle)) {

> +				if (--pfn_needed_by_per_dumpfile <= 0)
> +					return ++end_pfn;

				pfn_needed_by_per_dumpfile--;
				if (pfn_needed_by_per_dumpfile <= 0) {
					end_pfn++;
					return end_pfn;
				}


> +			}
> +		}
> +	}
> +	return ++end_pfn;
> +}
> +
> +/*
> + * calculate end_pfn of one dumpfile.
> + * try to make every output file have the same size.
> + * splitblock_table is used to reduce calculate time.
> + */
> +
> +#define CURRENT_SPLITBLOCK_PFN_NUM (*current_splitblock * splitblock->page_per_splitblock)
> +mdf_pfn_t
> +calculate_end_pfn_by_splitblock(mdf_pfn_t start_pfn,
> +			   int *current_splitblock,
> +			   long long *current_splitblock_pfns){

			   long long *current_splitblock_pfns)
{

> +	mdf_pfn_t end_pfn;
> +	long long pfn_needed_by_per_dumpfile,offset;

Please add a space:

	long long pfn_needed_by_per_dumpfile, offset;

Also, some variable names are too long: current_splitblock,
current_splitblock_pfns and pfn_needed_by_per_dumpfile.

From current_splitblock, I imagine a SplitBlock object, but this is in
fact an integer representing an index of splitblocks.

For current_splitblock_pfns, is it really necessary? Please see my
comment later in this mail.

> +	pfn_needed_by_per_dumpfile = info->num_dumpable / info->num_dumpfile;
> +	offset = *current_splitblock * splitblock->entry_size;
> +	end_pfn = start_pfn;

These assignments are not declarations. Please move them below the
declaration of splitblock_inner, so that all declarations come first.

> +	char *splitblock_inner = splitblock->table + offset;

Please add a blank line here.

> +	//calculate the part containing complete splitblock

Please use /* */ style.

> +	while (*current_splitblock < splitblock->num && pfn_needed_by_per_dumpfile > 0) {
> +		if (*current_splitblock_pfns > 0) {
> +			pfn_needed_by_per_dumpfile -= *current_splitblock_pfns ;
> +			*current_splitblock_pfns = 0 ;
> +		}
> +		else
> +		pfn_needed_by_per_dumpfile -= read_value_from_splitblock_table(splitblock_inner);

Ah, this line exceeds 80 columns. Please wrap it and add braces:

		} else {
			pfn_needed_by_per_dumpfile -=
				read_value_from_splitblock_table(splitblock_inner);
		}

Again, pfn_needed_by_per_dumpfile is too long.

> +		splitblock_inner += splitblock->entry_size;
> +		++*current_splitblock;
> +	}

Please add a blank line here.

> +	//deal with complete splitblock

Please use /* */ style.

What does "complete" mean?

> +	if (pfn_needed_by_per_dumpfile == 0)
> +		end_pfn = CURRENT_SPLITBLOCK_PFN_NUM;

Please add a blank line here.

> +	//deal with incomplete splitblock

Please use /* */ style.

What does "incomplete" mean?

> +	if (pfn_needed_by_per_dumpfile < 0) {

Just as I've already commented in another mail, I think filtering here
is unnecessary. Is it enough to add this splitblock to the current
dumpfile rather than the next one? Then this code is unnecessary, and
current_splitblock_pfns is unnecessary as well.
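
To illustrate the simplification I mean: the split point could simply
advance whole splitblocks until the per-file quota is consumed, so every
cut falls on a splitblock boundary. A minimal sketch follows; the flat
table, the names, and the quota handling are simplified assumptions for
illustration, not the actual makedumpfile structures:

```c
#include <assert.h>

/*
 * Hypothetical simplified split-point calculation: subtract each
 * splitblock's dumpable-page count from the per-file quota and stop at
 * the first splitblock boundary at or past the quota.  The splitblock
 * that crosses the quota goes wholly into the current dumpfile, so no
 * per-page rescan (and no current_splitblock_pfns carry-over) is needed.
 */
static int
end_block_without_rescan(const long long *pages_per_block, int num_blocks,
			 int *cur_block, long long quota)
{
	while (*cur_block < num_blocks && quota > 0)
		quota -= pages_per_block[(*cur_block)++];

	return *cur_block;	/* split point: index of the next splitblock */
}
```

With pages_per_block = {100, 100, 50} and a quota of 150, the cut lands
after the second splitblock: the current file gets 200 pages, i.e. within
one splitblock of fair, and the next call continues from there.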

> +		--*current_splitblock;
> +		splitblock_inner -= splitblock->entry_size;
> +		end_pfn = CURRENT_SPLITBLOCK_PFN_NUM;
> +		*current_splitblock_pfns = (-1) * pfn_needed_by_per_dumpfile;
> +		pfn_needed_by_per_dumpfile += read_value_from_splitblock_table(splitblock_inner);
> +		end_pfn = calculate_end_pfn_in_cycle(CURRENT_SPLITBLOCK_PFN_NUM,
> +						     CURRENT_SPLITBLOCK_PFN_NUM+splitblock->page_per_splitblock,
> +						     end_pfn,pfn_needed_by_per_dumpfile);
> +	}

Please add a blank line here.

> +	//deal with memory not managed by splitblock

Please use /* */ style.

> +	if (pfn_needed_by_per_dumpfile > 0 && *current_splitblock >= splitblock->num) {
> +		mdf_pfn_t cycle_start_pfn = MAX(CURRENT_SPLITBLOCK_PFN_NUM,end_pfn);
> +		end_pfn=calculate_end_pfn_in_cycle(cycle_start_pfn,
> +						   info->max_mapnr,
> +						   end_pfn,
> +						   pfn_needed_by_per_dumpfile);
> +	}
> +	return end_pfn;
> +}

Please add a blank line here.

> +/*
> + * calculate start_pfn and end_pfn in each output file.
> + */
> +static int setup_splitting_cyclic(void)
> +{
> +	int i;
> +	mdf_pfn_t start_pfn, end_pfn;
> +	long long current_splitblock_pfns = 0;
> +	int current_splitblock = 0;

Please add a blank line here.

> +	start_pfn = end_pfn = 0;
> +	for (i = 0; i < info->num_dumpfile - 1; i++) {
> +		start_pfn = end_pfn;
> +		end_pfn = calculate_end_pfn_by_splitblock(start_pfn,
> +							  &current_splitblock,
> +							  &current_splitblock_pfns);
> +		SPLITTING_START_PFN(i) = start_pfn;
> +		SPLITTING_END_PFN(i) = end_pfn;
> +	}
> +	SPLITTING_START_PFN(info->num_dumpfile - 1) = end_pfn;
> +	SPLITTING_END_PFN(info->num_dumpfile - 1) = info->max_mapnr;
> +	return TRUE;
> +}
> +
>  int
>  setup_splitting(void)
>  {
> @@ -8203,12 +8300,14 @@ setup_splitting(void)
>  		return FALSE;
>  
>  	if (info->flag_cyclic) {
> -		for (i = 0; i < info->num_dumpfile; i++) {
> -			SPLITTING_START_PFN(i) = divideup(info->max_mapnr, info->num_dumpfile) * i;
> -			SPLITTING_END_PFN(i)   = divideup(info->max_mapnr, info->num_dumpfile) * (i + 1);
> +		int ret = FALSE;

Please add a blank line here.

> +		if(!prepare_bitmap2_buffer_cyclic()){
> +			free_bitmap_buffer();
> +			return ret;
>  		}
> -		if (SPLITTING_END_PFN(i-1) > info->max_mapnr)
> -			SPLITTING_END_PFN(i-1) = info->max_mapnr;
> +		ret = setup_splitting_cyclic();
> +		free_bitmap2_buffer_cyclic();
> +		return ret;
>          } else {
>  		initialize_2nd_bitmap(&bitmap2);
>  
> -- 
> 1.7.1
> 
> 
--
Thanks.
HATAYAMA, Daisuke


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


* Re: [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time
  2014-10-28  6:32         ` qiaonuohan
@ 2014-10-30  0:21           ` HATAYAMA Daisuke
  2014-11-05  4:18             ` Atsushi Kumagai
  0 siblings, 1 reply; 21+ messages in thread
From: HATAYAMA Daisuke @ 2014-10-30  0:21 UTC (permalink / raw)
  To: qiaonuohan; +Cc: kexec, zhouwj-fnst, kumagai-atsushi

From: qiaonuohan <qiaonuohan@cn.fujitsu.com>
Subject: Re: [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time
Date: Tue, 28 Oct 2014 14:32:12 +0800

> On 10/28/2014 02:24 PM, HATAYAMA Daisuke wrote:
>> From: Atsushi Kumagai<kumagai-atsushi@mxc.nes.nec.co.jp>
>> Subject: RE: [PATCH v2 0/5] makedumpfile: --split: assign fair I/O
>> workloads in appropriate time
>> Date: Mon, 27 Oct 2014 07:51:56 +0000
>>
>>> Hello Zhou,
>>>
>>>> On 10/17/2014 11:50 AM, Atsushi Kumagai wrote:
>>>>> Hello,
>>>>>
>>>>> The code looks good to me, thanks Zhou.
>>>>> Now, I have a question on performance.
>>>>>
>>>>>> The issue is discussed at
>>>>>> http://lists.infradead.org/pipermail/kexec/2014-March/011289.html
>>>>>>
>>>>>> This patch implements the idea of a 2-pass algorithm with smaller
>>>>>> memory to manage the splitblock table.
>>>>>> Strictly, the algorithm is still 3-pass, but the time of the
>>>>>> second pass is much shorter.
>>>>>> The tables below show the performance with different sizes of
>>>>>> cyclic-buffer and splitblock.
>>>>>> The test is executed on a machine having 128G of memory.
>>>>>>
>>>>>> The values are total time (including first pass and second pass).
>>>>>> The values in brackets are the time of the second pass.
>>>>>
>>>>> Do you have any idea why the time of second pass is much larger when
>>>>> the splitblock-size is 2G ? I worry about the scalability.
>>>>>
>>>> Hello,
>>>>
>>>> Since the previous machine can't be used for some reasons, I tested
>>>> several times using the latest code on other machines, but that
>>>> never happened. It seems that all things are right. Tests were
>>>> executed on two machines (a server and a PC).
>>>> Tests are based on:
>>>
>>> Well...OK, I'll take that as an issue specific to that machine
>>> (or your mistakes as you said).
>>> Now I have another question.
>>>
>>> calculate_end_pfn_by_splitblock():
>>> 	...
>>>          /* deal with incomplete splitblock */
>>>          if (pfn_needed_by_per_dumpfile<  0) {
>>>                  --*current_splitblock;
>>>                  splitblock_inner -= splitblock->entry_size;
>>>                  end_pfn = CURRENT_SPLITBLOCK_PFN_NUM;
>>>                  *current_splitblock_pfns = (-1) * pfn_needed_by_per_dumpfile;
>>>                  pfn_needed_by_per_dumpfile +=
>>>                  read_value_from_splitblock_table(splitblock_inner);
>>>                  end_pfn = calculate_end_pfn_in_cycle(CURRENT_SPLITBLOCK_PFN_NUM,
>>>                                                       CURRENT_SPLITBLOCK_PFN_NUM + splitblock->page_per_splitblock,
>>>                                                       end_pfn,pfn_needed_by_per_dumpfile);
>>>          }
>>>
>>> This block causes the re-scanning of the cycle corresponding to the
>>> current_splitblock, so the larger cyc-buf is, the longer it takes.
>>> If cyc-buf is 4096 (this means the number of cycles is 1), the whole
>>> page scanning will be done in the second pass. Actually, the
>>> performance when cyc-buf=4096 was so bad.
>>>
>>> Is this process necessary? I think splitting splitblocks is overkill,
>>> because I understood that splblk-size is the granularity of the I/O
>>> fairness, and tuning splblk-size is a trade-off between fairness and
>>> memory usage.
>>> However, there is no advantage to reducing splblk-size in the current
>>> implementation; it just consumes large amounts of memory.
>>> If we remove the process, we can avoid the whole page scanning in
>>> the second pass, and reducing splblk-size will be meaningful as I
>>> expected.
>>>
>>
>> Yes, I don't think this rescan works with this splitblock method
>> either. The idea of this splitblock method is to reduce the number of
>> filtering passes from three to two at the expense of at most a
>> splitblock-size difference between dump files. Doing a rescan here
>> doesn't fit that idea.
> 
> Hello,
> 
> The only thing that bothers me is that without getting the exact pfn,
> some of the split files may be empty, with no pages stored in them. If
> this is not an issue, I think the re-scanning is useless.
> 

It is within the idea I wrote above that empty files can occur. But
there might be further room for improvement to decrease the possibility
of empty files. For example, how about deriving the default splitblock
size from the actual number of dumpable pages, instead of a constant 1GB?
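
As a rough sketch of what such a derivation might look like; the page
size, the 64-splitblocks-per-file granularity, and the clamping policy
below are all assumptions for illustration, not existing makedumpfile
code:

```c
#include <assert.h>

#define DEFAULT_SPLITBLOCK_SIZE	(1ULL << 30)	/* the current fixed 1 GiB */
#define ASSUMED_PAGE_SIZE	4096ULL

/*
 * Hypothetical: size splitblocks so that each dumpfile spans roughly
 * blocks_per_file splitblocks of dumpable memory.  More splitblocks per
 * file makes the at-most-one-splitblock size skew between files a
 * smaller fraction of each file, lowering the chance of empty files.
 */
static unsigned long long
derive_splitblock_size(unsigned long long num_dumpable, int num_dumpfile)
{
	const unsigned long long blocks_per_file = 64;	/* assumed */
	unsigned long long size;

	size = num_dumpable * ASSUMED_PAGE_SIZE
		/ (num_dumpfile * blocks_per_file);
	size -= size % ASSUMED_PAGE_SIZE;	/* page-align downwards */
	if (size < ASSUMED_PAGE_SIZE)
		size = ASSUMED_PAGE_SIZE;	/* keep a sane floor */
	if (size > DEFAULT_SPLITBLOCK_SIZE)
		size = DEFAULT_SPLITBLOCK_SIZE;	/* never exceed today's 1 GiB */

	return size;
}
```

For example, 128 GiB of dumpable memory (2^25 pages of 4 KiB) split into
4 files would get 512 MiB splitblocks, while a tiny dump would get
page-sized blocks instead of a single 1 GiB block.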

--
Thanks.
HATAYAMA, Daisuke




* RE: [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time
  2014-10-30  0:21           ` HATAYAMA Daisuke
@ 2014-11-05  4:18             ` Atsushi Kumagai
  0 siblings, 0 replies; 21+ messages in thread
From: Atsushi Kumagai @ 2014-11-05  4:18 UTC (permalink / raw)
  To: d.hatayama@jp.fujitsu.com, qiaonuohan@cn.fujitsu.com
  Cc: zhouwj-fnst@cn.fujitsu.com, kexec@lists.infradead.org

>From: qiaonuohan <qiaonuohan@cn.fujitsu.com>
>Subject: Re: [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time
>Date: Tue, 28 Oct 2014 14:32:12 +0800
>
>> On 10/28/2014 02:24 PM, HATAYAMA Daisuke wrote:
>>> From: Atsushi Kumagai<kumagai-atsushi@mxc.nes.nec.co.jp>
>>> Subject: RE: [PATCH v2 0/5] makedumpfile: --split: assign fair I/O
>>> workloads in appropriate time
>>> Date: Mon, 27 Oct 2014 07:51:56 +0000
>>>
>>>> Hello Zhou,
>>>>
>>>>> On 10/17/2014 11:50 AM, Atsushi Kumagai wrote:
>>>>>> Hello,
>>>>>>
>>>>>> The code looks good to me, thanks Zhou.
>>>>>> Now, I have a question on performance.
>>>>>>
>>>>>>> The issue is discussed at
>>>>>>> http://lists.infradead.org/pipermail/kexec/2014-March/011289.html
>>>>>>>
>>>>>>> This patch implements the idea of a 2-pass algorithm with smaller
>>>>>>> memory to manage the splitblock table.
>>>>>>> Strictly, the algorithm is still 3-pass, but the time of the
>>>>>>> second pass is much shorter.
>>>>>>> The tables below show the performance with different sizes of
>>>>>>> cyclic-buffer and splitblock.
>>>>>>> The test is executed on a machine having 128G of memory.
>>>>>>>
>>>>>>> The values are total time (including first pass and second pass).
>>>>>>> The values in brackets are the time of the second pass.
>>>>>>
>>>>>> Do you have any idea why the time of second pass is much larger when
>>>>>> the splitblock-size is 2G ? I worry about the scalability.
>>>>>>
>>>>> Hello,
>>>>>
>>>>> Since the previous machine can't be used for some reasons, I tested
>>>>> several times using the latest code on other machines, but that
>>>>> never happened. It seems that all things are right. Tests were
>>>>> executed on two machines (a server and a PC).
>>>>> Tests are based on:
>>>>
>>>> Well...OK, I'll take that as an issue specific to that machine
>>>> (or your mistakes as you said).
>>>> Now I have another question.
>>>>
>>>> calculate_end_pfn_by_splitblock():
>>>> 	...
>>>>          /* deal with incomplete splitblock */
>>>>          if (pfn_needed_by_per_dumpfile<  0) {
>>>>                  --*current_splitblock;
>>>>                  splitblock_inner -= splitblock->entry_size;
>>>>                  end_pfn = CURRENT_SPLITBLOCK_PFN_NUM;
>>>>                  *current_splitblock_pfns = (-1) * pfn_needed_by_per_dumpfile;
>>>>                  pfn_needed_by_per_dumpfile +=
>>>>                  read_value_from_splitblock_table(splitblock_inner);
>>>>                  end_pfn = calculate_end_pfn_in_cycle(CURRENT_SPLITBLOCK_PFN_NUM,
>>>>                  end_pfn = calculate_end_pfn_in_cycle(CURRENT_SPLITBLOCK_PFN_NUM,
>>>>                                                       CURRENT_SPLITBLOCK_PFN_NUM + splitblock->page_per_splitblock,
>>>>                                                       end_pfn,pfn_needed_by_per_dumpfile);
>>>>          }
>>>>
>>>> This block causes the re-scanning of the cycle corresponding to the
>>>> current_splitblock, so the larger cyc-buf is, the longer it takes.
>>>> If cyc-buf is 4096 (this means the number of cycles is 1), the whole
>>>> page scanning will be done in the second pass. Actually, the
>>>> performance when cyc-buf=4096 was so bad.
>>>>
>>>> Is this process necessary? I think splitting splitblocks is overkill,
>>>> because I understood that splblk-size is the granularity of the I/O
>>>> fairness, and tuning splblk-size is a trade-off between fairness and
>>>> memory usage.
>>>> However, there is no advantage to reducing splblk-size in the current
>>>> implementation; it just consumes large amounts of memory.
>>>> If we remove the process, we can avoid the whole page scanning in
>>>> the second pass, and reducing splblk-size will be meaningful as I
>>>> expected.
>>>>
>>>
>>> Yes, I don't think this rescan works with this splitblock method
>>> either. The idea of this splitblock method is to reduce the number of
>>> filtering passes from three to two at the expense of at most a
>>> splitblock-size difference between dump files. Doing a rescan here
>>> doesn't fit that idea.
>>
>> Hello,
>>
>> The only thing that bothers me is that without getting the exact pfn,
>> some of the split files may be empty, with no pages stored in them. If
>> this is not an issue, I think the re-scanning is useless.
>>
>
>It is within the idea I wrote above that empty files can occur. But
>there might be further room for improvement to decrease the possibility
>of empty files. For example, how about deriving the default splitblock
>size from the actual number of dumpable pages, instead of a constant 1GB?

Empty files don't cause any problems; they are just a secondary result
of allowing the difference within the splitblock size.
I don't think it's worth avoiding empty files; I prefer to keep the
code simple for this issue.


Thanks,
Atsushi Kumagai




end of thread, other threads:[~2014-11-05  4:29 UTC | newest]

Thread overview: 21+ messages
2014-10-13  9:34 [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time Zhou Wenjian
2014-10-13  9:34 ` [PATCH v2 1/5] Add support for splitblock Zhou Wenjian
2014-10-28  6:30   ` HATAYAMA Daisuke
2014-10-13  9:34 ` [PATCH v2 2/5] Add tools for reading and writing from splitblock table Zhou Wenjian
2014-10-28  6:42   ` HATAYAMA Daisuke
2014-10-13  9:34 ` [PATCH v2 3/5] Add module of generating table Zhou Wenjian
2014-10-28  7:01   ` HATAYAMA Daisuke
2014-10-13  9:34 ` [PATCH v2 4/5] Add module of calculating start_pfn and end_pfn in each dumpfile Zhou Wenjian
2014-10-28  7:43   ` HATAYAMA Daisuke
2014-10-13  9:34 ` [PATCH v2 5/5] Add support for --splitblock-size Zhou Wenjian
2014-10-28  7:15   ` HATAYAMA Daisuke
2014-10-28  7:28     ` Atsushi Kumagai
2014-10-17  3:50 ` [PATCH v2 0/5] makedumpfile: --split: assign fair I/O workloads in appropriate time Atsushi Kumagai
2014-10-22  1:52   ` "Zhou, Wenjian/周文剑"
2014-10-27  1:08     ` "Zhou, Wenjian/周文剑"
2014-10-27  7:51     ` Atsushi Kumagai
2014-10-28  6:24       ` HATAYAMA Daisuke
2014-10-28  6:32         ` qiaonuohan
2014-10-30  0:21           ` HATAYAMA Daisuke
2014-11-05  4:18             ` Atsushi Kumagai
2014-10-27  6:19   ` "Zhou, Wenjian/周文剑"
