* [PATCH v3 01/10] Add readpage_kdump_compressed_parallel
2015-07-21 6:29 [PATCH v3 00/10] makedumpfile: parallel processing Zhou Wenjian
@ 2015-07-21 6:29 ` Zhou Wenjian
2015-07-21 6:29 ` [PATCH v3 02/10] Add mappage_elf_parallel Zhou Wenjian
` (9 subsequent siblings)
10 siblings, 0 replies; 18+ messages in thread
From: Zhou Wenjian @ 2015-07-21 6:29 UTC (permalink / raw)
To: kexec; +Cc: Qiao Nuohan
From: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
readpage_kdump_compressed_parallel is used to enable reading pages from
vmcore in kdump-compressed format in parallel. fd_memory and bitmap_memory
should be initialized and provided to each thread individually to avoid
conflicts.
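The descriptor-position lookup that backs this (pfn_to_pos_parallel in the hunk below) can be sketched as a standalone function. This is a hypothetical, self-contained version, not the makedumpfile API; the bit layout and helper names are illustrative:

```c
#include <assert.h>
#include <string.h>

#define BITMAP_SECT_LEN 4096

/* Sketch: the position of a page descriptor is the number of dumpable
 * pages that precede the pfn. A per-section running count (valid_pages[])
 * avoids scanning the whole bitmap; only the bits from the section
 * boundary up to the pfn are counted. Bit layout here is illustrative
 * (LSB-first within each byte). */
static int is_dumpable_bit(const unsigned char *bitmap, unsigned long pfn)
{
	return (bitmap[pfn >> 3] >> (pfn & 7)) & 1;
}

unsigned long pfn_to_desc_pos(unsigned long pfn,
			      const unsigned char *bitmap,
			      const unsigned long *valid_pages)
{
	unsigned long pos = valid_pages[pfn / BITMAP_SECT_LEN];
	unsigned long i;

	for (i = pfn - (pfn % BITMAP_SECT_LEN); i < pfn; i++)
		if (is_dumpable_bit(bitmap, i))
			pos++;
	return pos;
}
```

Because the scan state lives entirely in the caller-supplied bitmap, each thread can run this on its own bitmap copy without locking, which is the point of passing bitmap_memory_parallel explicitly.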
Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
Signed-off-by: Zhou wenjian <zhouwj-fnst@cn.fujitsu.com>
---
makedumpfile.c | 137 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 137 insertions(+), 0 deletions(-)
diff --git a/makedumpfile.c b/makedumpfile.c
index cc71f20..3657d4f 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -251,6 +251,20 @@ pfn_to_pos(mdf_pfn_t pfn)
return desc_pos;
}
+unsigned long
+pfn_to_pos_parallel(mdf_pfn_t pfn, struct dump_bitmap* bitmap_memory_parallel)
+{
+ unsigned long desc_pos;
+ mdf_pfn_t i;
+
+ desc_pos = info->valid_pages[pfn / BITMAP_SECT_LEN];
+ for (i = round(pfn, BITMAP_SECT_LEN); i < pfn; i++)
+ if (is_dumpable(bitmap_memory_parallel, i, NULL))
+ desc_pos++;
+
+ return desc_pos;
+}
+
int
read_page_desc(unsigned long long paddr, page_desc_t *pd)
{
@@ -293,6 +307,50 @@ read_page_desc(unsigned long long paddr, page_desc_t *pd)
return TRUE;
}
+int
+read_page_desc_parallel(int fd_memory, unsigned long long paddr,
+ page_desc_t *pd,
+ struct dump_bitmap* bitmap_memory_parallel)
+{
+ struct disk_dump_header *dh;
+ unsigned long desc_pos;
+ mdf_pfn_t pfn;
+ off_t offset;
+
+ /*
+ * Find page descriptor
+ */
+ dh = info->dh_memory;
+ offset
+ = (DISKDUMP_HEADER_BLOCKS + dh->sub_hdr_size + dh->bitmap_blocks)
+ * dh->block_size;
+ pfn = paddr_to_pfn(paddr);
+ desc_pos = pfn_to_pos_parallel(pfn, bitmap_memory_parallel);
+ offset += (off_t)desc_pos * sizeof(page_desc_t);
+ if (lseek(fd_memory, offset, SEEK_SET) < 0) {
+ ERRMSG("Can't seek %s. %s\n",
+ info->name_memory, strerror(errno));
+ return FALSE;
+ }
+
+ /*
+ * Read page descriptor
+ */
+ if (read(fd_memory, pd, sizeof(*pd)) != sizeof(*pd)) {
+ ERRMSG("Can't read %s. %s\n",
+ info->name_memory, strerror(errno));
+ return FALSE;
+ }
+
+ /*
+ * Sanity check
+ */
+ if (pd->size > dh->block_size)
+ return FALSE;
+
+ return TRUE;
+}
+
static void
unmap_cache(struct cache_entry *entry)
{
@@ -589,6 +647,85 @@ readpage_kdump_compressed(unsigned long long paddr, void *bufptr)
return TRUE;
}
+static int
+readpage_kdump_compressed_parallel(int fd_memory, unsigned long long paddr,
+ void *bufptr,
+ struct dump_bitmap* bitmap_memory_parallel)
+{
+ page_desc_t pd;
+ char buf[info->page_size], *rdbuf;
+ int ret;
+ unsigned long retlen;
+
+ if (!is_dumpable(bitmap_memory_parallel, paddr_to_pfn(paddr), NULL)) {
+ ERRMSG("pfn(%llx) is excluded from %s.\n",
+ paddr_to_pfn(paddr), info->name_memory);
+ return FALSE;
+ }
+
+ if (!read_page_desc_parallel(fd_memory, paddr, &pd,
+ bitmap_memory_parallel)) {
+ ERRMSG("Can't read page_desc: %llx\n", paddr);
+ return FALSE;
+ }
+
+ if (lseek(fd_memory, pd.offset, SEEK_SET) < 0) {
+ ERRMSG("Can't seek %s. %s\n",
+ info->name_memory, strerror(errno));
+ return FALSE;
+ }
+
+ /*
+ * Read page data
+ */
+ rdbuf = pd.flags & (DUMP_DH_COMPRESSED_ZLIB | DUMP_DH_COMPRESSED_LZO |
+ DUMP_DH_COMPRESSED_SNAPPY) ? buf : bufptr;
+ if (read(fd_memory, rdbuf, pd.size) != pd.size) {
+ ERRMSG("Can't read %s. %s\n",
+ info->name_memory, strerror(errno));
+ return FALSE;
+ }
+
+ if (pd.flags & DUMP_DH_COMPRESSED_ZLIB) {
+ retlen = info->page_size;
+ ret = uncompress((unsigned char *)bufptr, &retlen,
+ (unsigned char *)buf, pd.size);
+ if ((ret != Z_OK) || (retlen != info->page_size)) {
+ ERRMSG("Uncompress failed: %d\n", ret);
+ return FALSE;
+ }
+#ifdef USELZO
+ } else if (info->flag_lzo_support
+ && (pd.flags & DUMP_DH_COMPRESSED_LZO)) {
+ retlen = info->page_size;
+ ret = lzo1x_decompress_safe((unsigned char *)buf, pd.size,
+ (unsigned char *)bufptr, &retlen,
+ LZO1X_MEM_DECOMPRESS);
+ if ((ret != LZO_E_OK) || (retlen != info->page_size)) {
+ ERRMSG("Uncompress failed: %d\n", ret);
+ return FALSE;
+ }
+#endif
+#ifdef USESNAPPY
+ } else if ((pd.flags & DUMP_DH_COMPRESSED_SNAPPY)) {
+
+ ret = snappy_uncompressed_length(buf, pd.size, (size_t *)&retlen);
+ if (ret != SNAPPY_OK) {
+ ERRMSG("Uncompress failed: %d\n", ret);
+ return FALSE;
+ }
+
+ ret = snappy_uncompress(buf, pd.size, bufptr, (size_t *)&retlen);
+ if ((ret != SNAPPY_OK) || (retlen != info->page_size)) {
+ ERRMSG("Uncompress failed: %d\n", ret);
+ return FALSE;
+ }
+#endif
+ }
+
+ return TRUE;
+}
+
int
readmem(int type_addr, unsigned long long addr, void *bufptr, size_t size)
{
--
1.7.1
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
* [PATCH v3 02/10] Add mappage_elf_parallel
From: Zhou Wenjian @ 2015-07-21 6:29 UTC (permalink / raw)
To: kexec; +Cc: Qiao Nuohan
From: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
mappage_elf_parallel is used to enable mmapping the ELF-format vmcore into
memory in parallel. A later patch will use the mmapped memory to get the
data of each page. fd_memory and mmap_cache should be initialized and
provided to each thread individually to avoid conflicts.
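The per-thread mmap cache amounts to a private sliding window over the vmcore. A minimal sketch of the idea follows; the struct and function names are illustrative, not the makedumpfile API, and the remap policy is simplified:

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical per-thread mmap window: each thread owns one cached
 * mapping, so remapping needs no lock. */
struct mmap_window {
	char *buf;		/* MAP_FAILED while nothing is mapped */
	off_t start, end;	/* file-offset range covered by buf   */
};

/* Return a pointer valid for len bytes at offset, remapping the window
 * when the cached range does not cover the request. NULL means the
 * caller should fall back to read(). */
static char *window_read(int fd, struct mmap_window *w, off_t offset,
			 size_t len, size_t win_size, long page_size)
{
	if (w->buf == MAP_FAILED || offset < w->start ||
	    (off_t)(offset + len) > w->end) {
		/* mmap offsets must be page aligned */
		off_t aligned = offset & ~((off_t)page_size - 1);

		if (w->buf != MAP_FAILED)
			munmap(w->buf, w->end - w->start);
		w->buf = mmap(NULL, win_size, PROT_READ, MAP_PRIVATE,
			      fd, aligned);
		if (w->buf == MAP_FAILED)
			return NULL;
		w->start = aligned;
		w->end = aligned + win_size;
	}
	return w->buf + (offset - w->start);
}
```

As in update_mmap_range_parallel, the old mapping is torn down before a new one is created, and a failed mmap() signals the caller to fall back to read().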
Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
---
makedumpfile.c | 97 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
makedumpfile.h | 14 ++++++++
2 files changed, 111 insertions(+), 0 deletions(-)
diff --git a/makedumpfile.c b/makedumpfile.c
index 3657d4f..d1b4bc2 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -394,6 +394,46 @@ update_mmap_range(off_t offset, int initial) {
}
static int
+update_mmap_range_parallel(int fd_memory, off_t offset,
+ struct mmap_cache *mmap_cache)
+{
+ off_t start_offset, end_offset;
+ off_t map_size;
+ off_t max_offset = get_max_file_offset();
+ off_t pt_load_end = offset_to_pt_load_end(offset);
+
+ /*
+ * mmap_buf must be cleaned
+ */
+ if (mmap_cache->mmap_buf != MAP_FAILED)
+ munmap(mmap_cache->mmap_buf, mmap_cache->mmap_end_offset
+ - mmap_cache->mmap_start_offset);
+
+ /*
+ * offset for mmap() must be page aligned.
+ */
+ start_offset = roundup(offset, info->page_size);
+ end_offset = MIN(max_offset, round(pt_load_end, info->page_size));
+
+ if (!pt_load_end || (end_offset - start_offset) <= 0)
+ return FALSE;
+
+ map_size = MIN(end_offset - start_offset, info->mmap_region_size);
+
+ mmap_cache->mmap_buf = mmap(NULL, map_size, PROT_READ, MAP_PRIVATE,
+ fd_memory, start_offset);
+
+ if (mmap_cache->mmap_buf == MAP_FAILED) {
+ return FALSE;
+ }
+
+ mmap_cache->mmap_start_offset = start_offset;
+ mmap_cache->mmap_end_offset = start_offset + map_size;
+
+ return TRUE;
+}
+
+static int
is_mapped_with_mmap(off_t offset) {
if (info->flag_usemmap == MMAP_ENABLE
@@ -404,6 +444,15 @@ is_mapped_with_mmap(off_t offset) {
return FALSE;
}
+static int
+is_mapped_with_mmap_parallel(off_t offset, struct mmap_cache *mmap_cache) {
+ if (offset >= mmap_cache->mmap_start_offset
+ && offset < mmap_cache->mmap_end_offset)
+ return TRUE;
+ else
+ return FALSE;
+}
+
int
initialize_mmap(void) {
unsigned long long phys_start;
@@ -458,6 +507,54 @@ mappage_elf(unsigned long long paddr)
return info->mmap_buf + (offset - info->mmap_start_offset);
}
+static char *
+mappage_elf_parallel(int fd_memory, unsigned long long paddr,
+ struct mmap_cache *mmap_cache)
+{
+ off_t offset, offset2;
+ int flag_usemmap;
+
+ pthread_rwlock_rdlock(&info->usemmap_rwlock);
+ flag_usemmap = info->flag_usemmap;
+ pthread_rwlock_unlock(&info->usemmap_rwlock);
+ if (flag_usemmap != MMAP_ENABLE)
+ return NULL;
+
+ offset = paddr_to_offset(paddr);
+ if (!offset || page_is_fractional(offset))
+ return NULL;
+
+ offset2 = paddr_to_offset(paddr + info->page_size - 1);
+ if (!offset2)
+ return NULL;
+
+ if (offset2 - offset != info->page_size - 1)
+ return NULL;
+
+ if (!is_mapped_with_mmap_parallel(offset, mmap_cache) &&
+ !update_mmap_range_parallel(fd_memory, offset, mmap_cache)) {
+ ERRMSG("Can't read the dump memory(%s) with mmap().\n",
+ info->name_memory);
+
+ ERRMSG("This kernel might have some problems about mmap().\n");
+ ERRMSG("read() will be used instead of mmap() from now.\n");
+
+ /*
+ * Fall back to read().
+ */
+ pthread_rwlock_wrlock(&info->usemmap_rwlock);
+ info->flag_usemmap = MMAP_DISABLE;
+ pthread_rwlock_unlock(&info->usemmap_rwlock);
+ return NULL;
+ }
+
+ if (offset < mmap_cache->mmap_start_offset ||
+ offset + info->page_size > mmap_cache->mmap_end_offset)
+ return NULL;
+
+ return mmap_cache->mmap_buf + (offset - mmap_cache->mmap_start_offset);
+}
+
static int
read_from_vmcore(off_t offset, void *bufptr, unsigned long size)
{
diff --git a/makedumpfile.h b/makedumpfile.h
index 3d6661f..bff134e 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -42,6 +42,7 @@
#include "dwarf_info.h"
#include "diskdump_mod.h"
#include "sadump_mod.h"
+#include <pthread.h>
/*
* Result of command
@@ -956,6 +957,15 @@ typedef unsigned long int ulong;
typedef unsigned long long int ulonglong;
/*
+ * for parallel process
+ */
+struct mmap_cache {
+ char *mmap_buf;
+ off_t mmap_start_offset;
+ off_t mmap_end_offset;
+};
+
+/*
* makedumpfile header
* For re-arranging the dump data on different architecture, all the
* variables are defined by 64bits. The size of signature is aligned
@@ -1219,6 +1229,10 @@ struct DumpInfo {
* for cyclic_splitting mode, setup splitblock_size
*/
long long splitblock_size;
+ /*
+ * for parallel process
+ */
+ pthread_rwlock_t usemmap_rwlock;
};
extern struct DumpInfo *info;
--
1.7.1
* [PATCH v3 03/10] Add readpage_elf_parallel
From: Zhou Wenjian @ 2015-07-21 6:29 UTC (permalink / raw)
To: kexec; +Cc: Qiao Nuohan
From: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
readpage_elf_parallel is used to enable reading pages from the ELF-format
vmcore in parallel. fd_memory should be initialized and provided to each
thread individually to avoid conflicts.
Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
---
makedumpfile.c | 98 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 98 insertions(+), 0 deletions(-)
diff --git a/makedumpfile.c b/makedumpfile.c
index d1b4bc2..44c78b4 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -575,6 +575,27 @@ read_from_vmcore(off_t offset, void *bufptr, unsigned long size)
return TRUE;
}
+static int
+read_from_vmcore_parallel(int fd_memory, off_t offset, void *bufptr,
+ unsigned long size)
+{
+ const off_t failed = (off_t)-1;
+
+ if (lseek(fd_memory, offset, SEEK_SET) == failed) {
+ ERRMSG("Can't seek the dump memory(%s). (offset: %llx) %s\n",
+ info->name_memory, (unsigned long long)offset, strerror(errno));
+ return FALSE;
+ }
+
+ if (read(fd_memory, bufptr, size) != size) {
+ ERRMSG("Can't read the dump memory(%s). %s\n",
+ info->name_memory, strerror(errno));
+ return FALSE;
+ }
+
+ return TRUE;
+}
+
/*
* This function is specific for reading page from ELF.
*
@@ -669,6 +690,83 @@ readpage_elf(unsigned long long paddr, void *bufptr)
}
static int
+readpage_elf_parallel(int fd_memory, unsigned long long paddr, void *bufptr)
+{
+ off_t offset1, offset2;
+ size_t size1, size2;
+ unsigned long long phys_start, phys_end, frac_head = 0;
+
+ offset1 = paddr_to_offset(paddr);
+ offset2 = paddr_to_offset(paddr + info->page_size);
+ phys_start = paddr;
+ phys_end = paddr + info->page_size;
+
+ /*
+ * Check the case phys_start isn't aligned by page size like below:
+ *
+ * phys_start
+ * = 0x40ffda7000
+ * |<-- frac_head -->|------------- PT_LOAD -------------
+ * ----+-----------------------+---------------------+----
+ * | pfn:N | pfn:N+1 | ...
+ * ----+-----------------------+---------------------+----
+ * |
+ * pfn_to_paddr(pfn:N) # page size = 16k
+ * = 0x40ffda4000
+ */
+ if (!offset1) {
+ phys_start = page_head_to_phys_start(paddr);
+ offset1 = paddr_to_offset(phys_start);
+ frac_head = phys_start - paddr;
+ memset(bufptr, 0, frac_head);
+ }
+
+ /*
+ * Check the case phys_end isn't aligned by page size like the
+ * phys_start's case.
+ */
+ if (!offset2) {
+ phys_end = page_head_to_phys_end(paddr);
+ offset2 = paddr_to_offset(phys_end);
+ memset(bufptr + (phys_end - paddr), 0, info->page_size
+ - (phys_end - paddr));
+ }
+
+ /*
+ * Check the separated page on different PT_LOAD segments.
+ */
+ if (offset1 + (phys_end - phys_start) == offset2) {
+ size1 = phys_end - phys_start;
+ } else {
+ for (size1 = 1; size1 < info->page_size - frac_head; size1++) {
+ offset2 = paddr_to_offset(phys_start + size1);
+ if (offset1 + size1 != offset2)
+ break;
+ }
+ }
+
+ if(!read_from_vmcore_parallel(fd_memory, offset1, bufptr + frac_head,
+ size1)) {
+ ERRMSG("Can't read the dump memory(%s).\n",
+ info->name_memory);
+ return FALSE;
+ }
+
+ if (size1 + frac_head != info->page_size) {
+ size2 = phys_end - (phys_start + size1);
+
+ if(!read_from_vmcore_parallel(fd_memory, offset2,
+ bufptr + frac_head + size1, size2)) {
+ ERRMSG("Can't read the dump memory(%s).\n",
+ info->name_memory);
+ return FALSE;
+ }
+ }
+
+ return TRUE;
+}
+
+static int
readpage_kdump_compressed(unsigned long long paddr, void *bufptr)
{
page_desc_t pd;
--
1.7.1
* [PATCH v3 04/10] Add read_pfn_parallel
From: Zhou Wenjian @ 2015-07-21 6:29 UTC (permalink / raw)
To: kexec; +Cc: Qiao Nuohan
From: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
read_pfn_parallel is used to enable reading pages from vmcore in parallel.
The kdump-compressed and ELF formats are supported; mmap of the ELF
format is also supported.
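The ELF branch of read_pfn_parallel is a two-step fallback: try the thread's mmap window first, copy from it on a hit, and drop to a plain read on a miss. A self-contained sketch of that dispatch, with mappage/readpage as illustrative stand-ins for mappage_elf_parallel and readpage_elf_parallel:

```c
#include <assert.h>
#include <string.h>

#define PAGE_SIZE 4096	/* illustrative fixed page size */

/* Prefer the mmapped window; when the page cannot be mapped (fractional
 * page, mmap disabled, window miss), fall back to a plain read. */
static int read_page(int fd, unsigned long long paddr, void *buf,
		     char *(*mappage)(int, unsigned long long),
		     int (*readpage)(int, unsigned long long, void *))
{
	char *mapped = mappage(fd, paddr);

	if (mapped) {
		memcpy(buf, mapped, PAGE_SIZE);
		return 1;
	}
	return readpage(fd, paddr, buf);
}
```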
Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
---
Makefile | 2 ++
makedumpfile.c | 34 ++++++++++++++++++++++++++++++++++
2 files changed, 36 insertions(+), 0 deletions(-)
diff --git a/Makefile b/Makefile
index fc21a3f..b1daf5b 100644
--- a/Makefile
+++ b/Makefile
@@ -67,6 +67,8 @@ LIBS := -lsnappy $(LIBS)
CFLAGS += -DUSESNAPPY
endif
+LIBS := -lpthread $(LIBS)
+
all: makedumpfile
$(OBJ_PART): $(SRC_PART)
diff --git a/makedumpfile.c b/makedumpfile.c
index 44c78b4..e15855b 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -6349,6 +6349,40 @@ read_pfn(mdf_pfn_t pfn, unsigned char *buf)
}
int
+read_pfn_parallel(int fd_memory, mdf_pfn_t pfn, unsigned char *buf,
+ struct dump_bitmap* bitmap_memory_parallel,
+ struct mmap_cache *mmap_cache)
+{
+ unsigned long long paddr;
+ unsigned long long pgaddr;
+
+ paddr = pfn_to_paddr(pfn);
+
+ pgaddr = PAGEBASE(paddr);
+
+ if (info->flag_refiltering) {
+ if (!readpage_kdump_compressed_parallel(fd_memory, pgaddr, buf,
+ bitmap_memory_parallel)) {
+ ERRMSG("Can't get the page data.\n");
+ return FALSE;
+ }
+ } else {
+ char *mapbuf = mappage_elf_parallel(fd_memory, pgaddr,
+ mmap_cache);
+ if (mapbuf) {
+ memcpy(buf, mapbuf, info->page_size);
+ } else {
+ if (!readpage_elf_parallel(fd_memory, pgaddr, buf)) {
+ ERRMSG("Can't get the page data.\n");
+ return FALSE;
+ }
+ }
+ }
+
+ return TRUE;
+}
+
+int
get_loads_dumpfile_cyclic(void)
{
int i, phnum, num_new_load = 0;
--
1.7.1
* [PATCH v3 05/10] Add function to initial bitmap for parallel use
From: Zhou Wenjian @ 2015-07-21 6:29 UTC (permalink / raw)
To: kexec; +Cc: Qiao Nuohan
From: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
initialize_bitmap_memory_parallel and initialize_2nd_bitmap_parallel are
used by the parallel processing to avoid conflicts on the bitmaps.
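Both initializers follow the same shape: give the thread a private fd and a private cache, and reset the cached-block marker so the first access forces a read. A sketch with illustrative field names (not the makedumpfile struct):

```c
#include <assert.h>
#include <string.h>

#define BUFSIZE_BITMAP 4096

/* Per-thread bitmap state: a private fd and cache buffer mean one
 * thread's cached block (no_block) never clobbers another's. */
struct dump_bitmap_par {
	int fd;				/* thread-private bitmap fd    */
	long no_block;			/* cached block, -1 = none     */
	long offset;			/* bitmap start offset in file */
	unsigned char buf[BUFSIZE_BITMAP];
};

void init_bitmap_parallel(struct dump_bitmap_par *b, int fd, long offset)
{
	b->fd = fd;
	b->no_block = -1;		/* force the first access to read */
	b->offset = offset;
	memset(b->buf, 0, BUFSIZE_BITMAP);
}
```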
Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
---
makedumpfile.c | 20 ++++++++++++++++++++
makedumpfile.h | 18 ++++++++++++++++++
2 files changed, 38 insertions(+), 0 deletions(-)
diff --git a/makedumpfile.c b/makedumpfile.c
index e15855b..9c5da35 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -3411,6 +3411,16 @@ initialize_bitmap_memory(void)
return TRUE;
}
+void
+initialize_bitmap_memory_parallel(struct dump_bitmap *bitmap, int thread_num)
+{
+ bitmap->fd = FD_BITMAP_MEMORY_PARALLEL(thread_num);
+ bitmap->file_name = info->name_memory;
+ bitmap->no_block = -1;
+ memset(bitmap->buf, 0, BUFSIZE_BITMAP);
+ bitmap->offset = info->bitmap_memory->offset;
+}
+
int
calibrate_machdep_info(void)
{
@@ -3725,6 +3735,16 @@ initialize_2nd_bitmap(struct dump_bitmap *bitmap)
bitmap->offset = info->len_bitmap / 2;
}
+void
+initialize_2nd_bitmap_parallel(struct dump_bitmap *bitmap, int thread_num)
+{
+ bitmap->fd = FD_BITMAP_PARALLEL(thread_num);
+ bitmap->file_name = info->name_bitmap;
+ bitmap->no_block = -1;
+ memset(bitmap->buf, 0, BUFSIZE_BITMAP);
+ bitmap->offset = info->len_bitmap / 2;
+}
+
int
set_bitmap_file(struct dump_bitmap *bitmap, mdf_pfn_t pfn, int val)
{
diff --git a/makedumpfile.h b/makedumpfile.h
index bff134e..4b0709c 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -429,6 +429,11 @@ do { \
#define SPLITTING_SIZE_EI(i) info->splitting_info[i].size_eraseinfo
/*
+ * Macro for getting parallel info.
+ */
+#define FD_BITMAP_MEMORY_PARALLEL(i) info->parallel_info[i].fd_bitmap_memory
+#define FD_BITMAP_PARALLEL(i) info->parallel_info[i].fd_bitmap
+/*
* kernel version
*
* NOTE: the format of kernel_version is as follows
@@ -1000,6 +1005,18 @@ struct splitting_info {
unsigned long size_eraseinfo;
} splitting_info_t;
+struct parallel_info {
+ int fd_memory;
+ int fd_bitmap_memory;
+ int fd_bitmap;
+ unsigned char *buf;
+ unsigned char *buf_out;
+ struct mmap_cache *mmap_cache;
+#ifdef USELZO
+ lzo_bytep wrkmem;
+#endif
+} parallel_info_t;
+
struct ppc64_vmemmap {
unsigned long phys;
unsigned long virt;
@@ -1136,6 +1153,7 @@ struct DumpInfo {
char *name_dumpfile;
int num_dumpfile;
struct splitting_info *splitting_info;
+ struct parallel_info *parallel_info;
/*
* bitmap info:
--
1.7.1
* [PATCH v3 06/10] Add filter_data_buffer_parallel
From: Zhou Wenjian @ 2015-07-21 6:29 UTC (permalink / raw)
To: kexec; +Cc: Qiao Nuohan
From: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
filter_data_buffer_parallel is used to enable filtering the buffer in
parallel.
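The locking pattern here is worth noting: only the lookup in the shared filter list is serialized; erasing the caller's private buffer happens outside the lock. A sketch with hypothetical names, where the lookup callback stands in for extract_filter_info and is expected to consume the entry it returns:

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

struct filter_hit {
	unsigned long long paddr;
	size_t size;
	int nullify;			/* zero-fill instead of erase_ch */
	unsigned char erase_ch;
};

typedef int (*filter_lookup_fn)(unsigned long long start,
				unsigned long long end,
				struct filter_hit *hit);

void filter_buf(unsigned char *buf, unsigned long long paddr, size_t size,
		filter_lookup_fn lookup, pthread_mutex_t *mutex)
{
	struct filter_hit hit;

	for (;;) {
		/* only the shared-list lookup is under the mutex */
		pthread_mutex_lock(mutex);
		int found = lookup(paddr, paddr + size, &hit);
		pthread_mutex_unlock(mutex);
		if (!found)
			break;
		/* erase the thread-private buffer without the lock held */
		memset(buf + (hit.paddr - paddr),
		       hit.nullify ? 0 : hit.erase_ch, hit.size);
	}
}
```

Keeping the memset outside the critical section keeps the lock hold time proportional to the lookup, not to the amount of data erased.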
Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
---
erase_info.c | 29 ++++++++++++++++++++++++++++-
erase_info.h | 2 ++
2 files changed, 30 insertions(+), 1 deletions(-)
diff --git a/erase_info.c b/erase_info.c
index e0e0f71..0b253d7 100644
--- a/erase_info.c
+++ b/erase_info.c
@@ -2328,7 +2328,6 @@ extract_filter_info(unsigned long long start_paddr,
return TRUE;
}
-
/*
* External functions.
*/
@@ -2413,6 +2412,34 @@ filter_data_buffer(unsigned char *buf, unsigned long long paddr,
}
}
+/*
+ * Filter buffer if the physical address is in filter_info.
+ */
+void
+filter_data_buffer_parallel(unsigned char *buf, unsigned long long paddr,
+ size_t size, pthread_mutex_t *mutex)
+{
+ struct filter_info fl_info;
+ unsigned char *buf_ptr;
+ int found = FALSE;
+
+ while (TRUE) {
+ pthread_mutex_lock(mutex);
+ found = extract_filter_info(paddr, paddr + size, &fl_info);
+ pthread_mutex_unlock(mutex);
+
+ if (found) {
+ buf_ptr = buf + (fl_info.paddr - paddr);
+ if (fl_info.nullify)
+ memset(buf_ptr, 0, fl_info.size);
+ else
+ memset(buf_ptr, fl_info.erase_ch, fl_info.size);
+ } else {
+ break;
+ }
+ }
+}
+
unsigned long
get_size_eraseinfo(void)
{
diff --git a/erase_info.h b/erase_info.h
index 4d4957e..b363a40 100644
--- a/erase_info.h
+++ b/erase_info.h
@@ -60,6 +60,8 @@ extern unsigned long num_erase_info;
int gather_filter_info(void);
void clear_filter_info(void);
void filter_data_buffer(unsigned char *buf, unsigned long long paddr, size_t size);
+void filter_data_buffer_parallel(unsigned char *buf, unsigned long long paddr,
+ size_t size, pthread_mutex_t *mutex);
unsigned long get_size_eraseinfo(void);
int update_filter_info_raw(unsigned long long, int, int);
--
1.7.1
* [PATCH v3 07/10] Add write_kdump_pages_parallel to allow parallel process
From: Zhou Wenjian @ 2015-07-21 6:29 UTC (permalink / raw)
To: kexec; +Cc: Qiao Nuohan
From: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
Use several threads to read and compress pages, and one thread to write the
produced pages into the dumpfile. The produced pages are stored in a
buffer, from which the consumer thread takes them.
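The hand-off described above can be sketched as a small producer-consumer ring keyed by pfn % NUM_SLOTS. This is an illustrative reduction, not the patch's code: producers claim pfns from a shared counter and park results in slots; the single consumer drains slots strictly in pfn order, so output stays sequential even though compression finishes out of order; a window check against the consumed count provides the backpressure (the patch does this with consumed_pfn and num_buffers):

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

#define NUM_SLOTS 4
#define TOTAL_PFN 64ULL

struct slot {
	pthread_mutex_t mutex;
	unsigned long long pfn;
	unsigned long long data;	/* stands in for a compressed page */
	int ready;			/* 0 = free, 1 = filled            */
};

static struct slot slots[NUM_SLOTS];
static pthread_mutex_t pfn_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned long long next_pfn;		/* producers claim from here */
static unsigned long long consumed_count;	/* pfns fully written out    */

static void init_slots(void)
{
	int i;

	for (i = 0; i < NUM_SLOTS; i++) {
		pthread_mutex_init(&slots[i].mutex, NULL);
		slots[i].ready = 0;
	}
}

static void *producer(void *arg)
{
	(void)arg;
	for (;;) {
		pthread_mutex_lock(&pfn_mutex);
		unsigned long long pfn = next_pfn++;
		pthread_mutex_unlock(&pfn_mutex);
		if (pfn >= TOTAL_PFN)
			return NULL;

		struct slot *s = &slots[pfn % NUM_SLOTS];
		for (;;) {
			/* backpressure: stay within NUM_SLOTS of the
			 * consumer, so this pfn's slot is already drained */
			pthread_mutex_lock(&pfn_mutex);
			unsigned long long done = consumed_count;
			pthread_mutex_unlock(&pfn_mutex);
			if (pfn >= done + NUM_SLOTS) {
				sched_yield();
				continue;
			}
			pthread_mutex_lock(&s->mutex);
			s->pfn = pfn;
			s->data = pfn * 2;	/* fake payload */
			s->ready = 1;
			pthread_mutex_unlock(&s->mutex);
			break;
		}
	}
}

/* Consumer: take slots in strict pfn order; returns the payload sum. */
static unsigned long long consume_all(void)
{
	unsigned long long pfn = 0, sum = 0;

	while (pfn < TOTAL_PFN) {
		struct slot *s = &slots[pfn % NUM_SLOTS];
		int got;

		pthread_mutex_lock(&s->mutex);
		got = s->ready && s->pfn == pfn;
		if (got) {
			sum += s->data;
			s->ready = 0;
		}
		pthread_mutex_unlock(&s->mutex);
		if (got) {
			pfn++;
			pthread_mutex_lock(&pfn_mutex);
			consumed_count = pfn;
			pthread_mutex_unlock(&pfn_mutex);
		} else {
			sched_yield();
		}
	}
	return sum;
}
```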
Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
Signed-off-by: Zhou wenjian <zhouwj-fnst@cn.fujitsu.com>
---
makedumpfile.c | 439 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
makedumpfile.h | 45 ++++++
2 files changed, 484 insertions(+), 0 deletions(-)
diff --git a/makedumpfile.c b/makedumpfile.c
index 9c5da35..d0211cf 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -235,6 +235,31 @@ is_in_same_page(unsigned long vaddr1, unsigned long vaddr2)
return FALSE;
}
+static inline unsigned long
+calculate_len_buf_out(long page_size)
+{
+ unsigned long len_buf_out_zlib, len_buf_out_lzo, len_buf_out_snappy;
+ unsigned long len_buf_out;
+
+ len_buf_out_zlib = len_buf_out_lzo = len_buf_out_snappy = 0;
+
+#ifdef USELZO
+ len_buf_out_lzo = page_size + page_size / 16 + 64 + 3;
+#endif
+
+#ifdef USESNAPPY
+ len_buf_out_snappy = snappy_max_compressed_length(page_size);
+#endif
+
+ len_buf_out_zlib = compressBound(page_size);
+
+ len_buf_out = MAX(len_buf_out_zlib,
+ MAX(len_buf_out_lzo,
+ len_buf_out_snappy));
+
+ return len_buf_out;
+}
+
#define BITMAP_SECT_LEN 4096
static inline int is_dumpable(struct dump_bitmap *, mdf_pfn_t, struct cycle *cycle);
unsigned long
@@ -6671,6 +6696,420 @@ write_elf_pages_cyclic(struct cache_data *cd_header, struct cache_data *cd_page)
return TRUE;
}
+void *
+kdump_thread_function_cyclic(void *arg) {
+ /*
+ * lock memory to reduce page_faults by compress2()
+ */
+ void *temp = malloc(1);
+ memset(temp, 0, 1);
+ mlockall(MCL_CURRENT);
+ free(temp);
+
+ void *retval = PTHREAD_FAIL;
+ struct thread_args *kdump_thread_args = (struct thread_args *)arg;
+ struct page_data *page_data_buf = kdump_thread_args->page_data_buf;
+ struct cycle *cycle = kdump_thread_args->cycle;
+ int page_data_num = kdump_thread_args->page_data_num;
+ mdf_pfn_t pfn;
+ int index;
+ int found;
+ int dumpable;
+ int fd_memory = 0;
+ struct dump_bitmap bitmap_parallel = {0};
+ struct dump_bitmap bitmap_memory_parallel = {0};
+ unsigned char *buf = NULL, *buf_out = NULL;
+ struct mmap_cache *mmap_cache =
+ MMAP_CACHE_PARALLEL(kdump_thread_args->thread_num);
+ unsigned long size_out;
+#ifdef USELZO
+ lzo_bytep wrkmem = WRKMEM_PARALLEL(kdump_thread_args->thread_num);
+#endif
+#ifdef USESNAPPY
+ unsigned long len_buf_out_snappy =
+ snappy_max_compressed_length(info->page_size);
+#endif
+
+ buf = BUF_PARALLEL(kdump_thread_args->thread_num);
+ buf_out = BUF_OUT_PARALLEL(kdump_thread_args->thread_num);
+
+ fd_memory = FD_MEMORY_PARALLEL(kdump_thread_args->thread_num);
+
+ if (info->fd_bitmap) {
+ bitmap_parallel.buf = malloc(BUFSIZE_BITMAP);
+ initialize_2nd_bitmap_parallel(&bitmap_parallel,
+ kdump_thread_args->thread_num);
+ }
+
+ if (info->flag_refiltering) {
+ bitmap_memory_parallel.buf = malloc(BUFSIZE_BITMAP);
+ initialize_bitmap_memory_parallel(&bitmap_memory_parallel,
+ kdump_thread_args->thread_num);
+ }
+
+ while (1) {
+ /* get next pfn */
+ pthread_mutex_lock(&info->current_pfn_mutex);
+ pfn = info->current_pfn;
+ info->current_pfn++;
+ pthread_mutex_unlock(&info->current_pfn_mutex);
+
+ if (pfn >= kdump_thread_args->end_pfn)
+ break;
+
+ index = -1;
+ found = FALSE;
+
+ while (found == FALSE) {
+ /*
+ * need a cancellation point here
+ */
+ sleep(0);
+
+ index = pfn % page_data_num;
+
+ if (pfn - info->consumed_pfn > info->num_buffers)
+ continue;
+
+ if (page_data_buf[index].ready != 0)
+ continue;
+
+ pthread_mutex_lock(&page_data_buf[index].mutex);
+
+ if (page_data_buf[index].ready != 0)
+ goto unlock;
+
+ found = TRUE;
+
+ page_data_buf[index].pfn = pfn;
+ page_data_buf[index].ready = 1;
+
+ if (!info->fd_bitmap)
+ dumpable = is_dumpable(info->bitmap2,
+ pfn,
+ cycle);
+ else
+ dumpable = is_dumpable(&bitmap_parallel,
+ pfn,
+ cycle);
+ if (!dumpable) {
+ page_data_buf[index].dumpable = FALSE;
+ goto unlock;
+ }
+
+ page_data_buf[index].dumpable = TRUE;
+
+ if (!read_pfn_parallel(fd_memory, pfn, buf,
+ &bitmap_memory_parallel,
+ mmap_cache))
+ goto fail;
+
+ filter_data_buffer_parallel(buf, pfn_to_paddr(pfn),
+ info->page_size,
+ &info->filter_mutex);
+
+ if ((info->dump_level & DL_EXCLUDE_ZERO)
+ && is_zero_page(buf, info->page_size)) {
+ page_data_buf[index].zero = TRUE;
+ goto unlock;
+ }
+
+ page_data_buf[index].zero = FALSE;
+
+ /*
+ * Compress the page data.
+ */
+ size_out = kdump_thread_args->len_buf_out;
+ if ((info->flag_compress & DUMP_DH_COMPRESSED_ZLIB)
+ && ((size_out = kdump_thread_args->len_buf_out),
+ compress2(buf_out, &size_out, buf,
+ info->page_size,
+ Z_BEST_SPEED) == Z_OK)
+ && (size_out < info->page_size)) {
+ page_data_buf[index].flags =
+ DUMP_DH_COMPRESSED_ZLIB;
+ page_data_buf[index].size = size_out;
+ memcpy(page_data_buf[index].buf, buf_out, size_out);
+#ifdef USELZO
+ } else if (info->flag_lzo_support
+ && (info->flag_compress
+ & DUMP_DH_COMPRESSED_LZO)
+ && ((size_out = info->page_size),
+ lzo1x_1_compress(buf, info->page_size,
+ buf_out, &size_out,
+ wrkmem) == LZO_E_OK)
+ && (size_out < info->page_size)) {
+ page_data_buf[index].flags =
+ DUMP_DH_COMPRESSED_LZO;
+ page_data_buf[index].size = size_out;
+ memcpy(page_data_buf[index].buf, buf_out, size_out);
+#endif
+#ifdef USESNAPPY
+ } else if ((info->flag_compress
+ & DUMP_DH_COMPRESSED_SNAPPY)
+ && ((size_out = len_buf_out_snappy),
+ snappy_compress((char *)buf,
+ info->page_size,
+ (char *)buf_out,
+ (size_t *)&size_out)
+ == SNAPPY_OK)
+ && (size_out < info->page_size)) {
+ page_data_buf[index].flags =
+ DUMP_DH_COMPRESSED_SNAPPY;
+ page_data_buf[index].size = size_out;
+ memcpy(page_data_buf[index].buf, buf_out, size_out);
+#endif
+ } else {
+ page_data_buf[index].flags = 0;
+ page_data_buf[index].size = info->page_size;
+ memcpy(page_data_buf[index].buf, buf, info->page_size);
+ }
+unlock:
+ pthread_mutex_unlock(&page_data_buf[index].mutex);
+
+ }
+ }
+
+ retval = NULL;
+
+fail:
+ if (bitmap_memory_parallel.fd > 0)
+ close(bitmap_memory_parallel.fd);
+ if (bitmap_parallel.buf != NULL)
+ free(bitmap_parallel.buf);
+ if (bitmap_memory_parallel.buf != NULL)
+ free(bitmap_memory_parallel.buf);
+
+ pthread_exit(retval);
+}
+
+int
+write_kdump_pages_parallel_cyclic(struct cache_data *cd_header,
+ struct cache_data *cd_page,
+ struct page_desc *pd_zero,
+ off_t *offset_data, struct cycle *cycle)
+{
+ int ret = FALSE;
+ int res;
+ unsigned long len_buf_out;
+ mdf_pfn_t per;
+ mdf_pfn_t start_pfn, end_pfn;
+ struct page_desc pd;
+ struct timeval tv_start;
+ struct timeval last, new;
+ unsigned long long consuming_pfn;
+ pthread_t **threads = NULL;
+ struct thread_args *kdump_thread_args = NULL;
+ void *thread_result;
+ int page_data_num;
+ struct page_data *page_data_buf = NULL;
+ int i;
+ int index;
+
+ if (info->flag_elf_dumpfile)
+ return FALSE;
+
+ res = pthread_mutex_init(&info->current_pfn_mutex, NULL);
+ if (res != 0) {
+ ERRMSG("Can't initialize current_pfn_mutex. %s\n",
+ strerror(res));
+ goto out;
+ }
+
+ res = pthread_mutex_init(&info->consumed_pfn_mutex, NULL);
+ if (res != 0) {
+ ERRMSG("Can't initialize consumed_pfn_mutex. %s\n",
+ strerror(res));
+ goto out;
+ }
+
+ res = pthread_mutex_init(&info->filter_mutex, NULL);
+ if (res != 0) {
+ ERRMSG("Can't initialize filter_mutex. %s\n", strerror(res));
+ goto out;
+ }
+
+ res = pthread_rwlock_init(&info->usemmap_rwlock, NULL);
+ if (res != 0) {
+ ERRMSG("Can't initialize usemmap_rwlock. %s\n", strerror(res));
+ goto out;
+ }
+
+ len_buf_out = calculate_len_buf_out(info->page_size);
+
+ per = info->num_dumpable / 10000;
+ per = per ? per : 1;
+
+ gettimeofday(&tv_start, NULL);
+
+ start_pfn = cycle->start_pfn;
+ end_pfn = cycle->end_pfn;
+
+ info->current_pfn = start_pfn;
+ info->consumed_pfn = start_pfn - 1;
+
+ threads = info->threads;
+ kdump_thread_args = info->kdump_thread_args;
+
+ page_data_num = info->num_buffers;
+ page_data_buf = info->page_data_buf;
+
+ for (i = 0; i < page_data_num; i++) {
+ /*
+ * producer will use pfn in page_data_buf to decide the
+ * consumed pfn
+ */
+ page_data_buf[i].pfn = start_pfn - 1;
+ page_data_buf[i].ready = 0;
+ res = pthread_mutex_init(&page_data_buf[i].mutex, NULL);
+ if (res != 0) {
+ ERRMSG("Can't initialize mutex of page_data_buf. %s\n",
+ strerror(res));
+ goto out;
+ }
+ }
+
+ for (i = 0; i < info->num_threads; i++) {
+ kdump_thread_args[i].thread_num = i;
+ kdump_thread_args[i].len_buf_out = len_buf_out;
+ kdump_thread_args[i].start_pfn = start_pfn;
+ kdump_thread_args[i].end_pfn = end_pfn;
+ kdump_thread_args[i].page_data_num = page_data_num;
+ kdump_thread_args[i].page_data_buf = page_data_buf;
+ kdump_thread_args[i].cycle = cycle;
+
+ res = pthread_create(threads[i], NULL,
+ kdump_thread_function_cyclic,
+ (void *)&kdump_thread_args[i]);
+ if (res != 0) {
+ ERRMSG("Can't create thread %d. %s\n",
+ i, strerror(res));
+ goto out;
+ }
+ }
+
+ consuming_pfn = start_pfn;
+ index = -1;
+
+ gettimeofday(&last, NULL);
+
+ while (consuming_pfn < end_pfn) {
+ index = consuming_pfn % page_data_num;
+
+ gettimeofday(&new, NULL);
+ if (new.tv_sec - last.tv_sec > WAIT_TIME) {
+ ERRMSG("Can't get data of pfn %llx.\n", consuming_pfn);
+ goto out;
+ }
+
+ /*
+ * check pfn first without mutex locked to reduce the time
+ * trying to lock the mutex
+ */
+ if (page_data_buf[index].pfn != consuming_pfn)
+ continue;
+
+ if (pthread_mutex_trylock(&page_data_buf[index].mutex) != 0)
+ continue;
+
+ /* check whether the found one is ready to be consumed */
+ if (page_data_buf[index].pfn != consuming_pfn ||
+ page_data_buf[index].ready != 1) {
+ goto unlock;
+ }
+
+ if ((num_dumped % per) == 0)
+ print_progress(PROGRESS_COPY, num_dumped, info->num_dumpable);
+
+ /* next pfn is found, refresh last here */
+ last = new;
+ consuming_pfn++;
+ info->consumed_pfn++;
+ page_data_buf[index].ready = 0;
+
+ if (page_data_buf[index].dumpable == FALSE)
+ goto unlock;
+
+ num_dumped++;
+
+ if (page_data_buf[index].zero == TRUE) {
+ if (!write_cache(cd_header, pd_zero, sizeof(page_desc_t)))
+ goto out;
+ pfn_zero++;
+ } else {
+ pd.flags = page_data_buf[index].flags;
+ pd.size = page_data_buf[index].size;
+ pd.page_flags = 0;
+ pd.offset = *offset_data;
+ *offset_data += pd.size;
+ /*
+ * Write the page header.
+ */
+ if (!write_cache(cd_header, &pd, sizeof(page_desc_t)))
+ goto out;
+ /*
+ * Write the page data.
+ */
+ if (!write_cache(cd_page, page_data_buf[index].buf, pd.size))
+ goto out;
+
+ }
+unlock:
+ pthread_mutex_unlock(&page_data_buf[index].mutex);
+ }
+
+ ret = TRUE;
+ /*
+ * print [100 %]
+ */
+ print_progress(PROGRESS_COPY, num_dumped, info->num_dumpable);
+ print_execution_time(PROGRESS_COPY, &tv_start);
+ PROGRESS_MSG("\n");
+
+out:
+ if (threads != NULL) {
+ for (i = 0; i < info->num_threads; i++) {
+ if (threads[i] != NULL) {
+ res = pthread_cancel(*threads[i]);
+ if (res != 0 && res != ESRCH)
+ ERRMSG("Can't cancel thread %d. %s\n",
+ i, strerror(res));
+ }
+ }
+
+ for (i = 0; i < info->num_threads; i++) {
+ if (threads[i] != NULL) {
+ res = pthread_join(*threads[i], &thread_result);
+ if (res != 0)
+ ERRMSG("Can't join with thread %d. %s\n",
+ i, strerror(res));
+
+ if (thread_result == PTHREAD_CANCELED)
+ DEBUG_MSG("Thread %d is cancelled.\n", i);
+ else if (thread_result == PTHREAD_FAIL)
+ DEBUG_MSG("Thread %d fails.\n", i);
+ else
+ DEBUG_MSG("Thread %d finishes.\n", i);
+
+ }
+ }
+ }
+
+ if (page_data_buf != NULL) {
+ for (i = 0; i < page_data_num; i++) {
+ pthread_mutex_destroy(&page_data_buf[i].mutex);
+ }
+ }
+
+ pthread_rwlock_destroy(&info->usemmap_rwlock);
+ pthread_mutex_destroy(&info->filter_mutex);
+ pthread_mutex_destroy(&info->consumed_pfn_mutex);
+ pthread_mutex_destroy(&info->current_pfn_mutex);
+
+ munlockall();
+ return ret;
+}
+
int
write_kdump_pages_cyclic(struct cache_data *cd_header, struct cache_data *cd_page,
struct page_desc *pd_zero, off_t *offset_data, struct cycle *cycle)
diff --git a/makedumpfile.h b/makedumpfile.h
index 4b0709c..5dbea60 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -431,8 +431,15 @@ do { \
/*
* Macro for getting parallel info.
*/
+#define FD_MEMORY_PARALLEL(i) info->parallel_info[i].fd_memory
#define FD_BITMAP_MEMORY_PARALLEL(i) info->parallel_info[i].fd_bitmap_memory
#define FD_BITMAP_PARALLEL(i) info->parallel_info[i].fd_bitmap
+#define BUF_PARALLEL(i) info->parallel_info[i].buf
+#define BUF_OUT_PARALLEL(i) info->parallel_info[i].buf_out
+#define MMAP_CACHE_PARALLEL(i) info->parallel_info[i].mmap_cache
+#ifdef USELZO
+#define WRKMEM_PARALLEL(i) info->parallel_info[i].wrkmem
+#endif
/*
* kernel version
*
@@ -964,12 +971,40 @@ typedef unsigned long long int ulonglong;
/*
* for parallel process
*/
+
+#define WAIT_TIME (60 * 10)
+#define PTHREAD_FAIL ((void *)-2)
+
struct mmap_cache {
char *mmap_buf;
off_t mmap_start_offset;
off_t mmap_end_offset;
};
+struct page_data
+{
+ mdf_pfn_t pfn;
+ int dumpable;
+ int zero;
+ unsigned int flags;
+ long size;
+ unsigned char *buf;
+ pthread_mutex_t mutex;
+ /*
+ * whether the page_data is ready to be consumed
+ */
+ int ready;
+};
+
+struct thread_args {
+ int thread_num;
+ unsigned long len_buf_out;
+ mdf_pfn_t start_pfn, end_pfn;
+ int page_data_num;
+ struct cycle *cycle;
+ struct page_data *page_data_buf;
+};
+
/*
* makedumpfile header
* For re-arranging the dump data on different architecture, all the
@@ -1250,7 +1285,17 @@ struct DumpInfo {
/*
* for parallel process
*/
+ int num_threads;
+ int num_buffers;
+ pthread_t **threads;
+ struct thread_args *kdump_thread_args;
+ struct page_data *page_data_buf;
pthread_rwlock_t usemmap_rwlock;
+ mdf_pfn_t current_pfn;
+ pthread_mutex_t current_pfn_mutex;
+ mdf_pfn_t consumed_pfn;
+ pthread_mutex_t consumed_pfn_mutex;
+ pthread_mutex_t filter_mutex;
};
extern struct DumpInfo *info;
--
1.7.1
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 08/10] Initial and free data used for parallel process
2015-07-21 6:29 [PATCH v3 00/10] makedumpfile: parallel processing Zhou Wenjian
` (6 preceding siblings ...)
2015-07-21 6:29 ` [PATCH v3 07/10] Add write_kdump_pages_parallel to allow parallel process Zhou Wenjian
@ 2015-07-21 6:29 ` Zhou Wenjian
2015-07-21 6:29 ` [PATCH v3 09/10] Make makedumpfile available to read and compress pages parallelly Zhou Wenjian
` (2 subsequent siblings)
10 siblings, 0 replies; 18+ messages in thread
From: Zhou Wenjian @ 2015-07-21 6:29 UTC (permalink / raw)
To: kexec; +Cc: Qiao Nuohan
From: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
This patch initializes and frees the data used for the parallel process;
the memory limit is also taken into account here.
Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
---
makedumpfile.c | 202 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
makedumpfile.h | 1 +
2 files changed, 203 insertions(+), 0 deletions(-)
diff --git a/makedumpfile.c b/makedumpfile.c
index d0211cf..417741f 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -1432,6 +1432,23 @@ open_dump_bitmap(void)
SPLITTING_FD_BITMAP(i) = fd;
}
}
+
+ if (info->num_threads) {
+ /*
+ * Reserve file descriptors of the bitmap for creating dumpfiles
+ * in parallel, because the bitmap file will be unlinked just after
+ * this and it is not possible to open it later.
+ */
+ for (i = 0; i < info->num_threads; i++) {
+ if ((fd = open(info->name_bitmap, O_RDONLY)) < 0) {
+ ERRMSG("Can't open the bitmap file(%s). %s\n",
+ info->name_bitmap, strerror(errno));
+ return FALSE;
+ }
+ FD_BITMAP_PARALLEL(i) = fd;
+ }
+ }
+
unlink(info->name_bitmap);
return TRUE;
@@ -3459,6 +3476,191 @@ calibrate_machdep_info(void)
}
int
+initial_for_parallel()
+{
+ unsigned long len_buf_out;
+ unsigned long page_data_buf_size;
+ unsigned long limit_size;
+ int page_data_num;
+ int i;
+
+ len_buf_out = calculate_len_buf_out(info->page_size);
+
+ /*
+ * allocate memory for threads
+ */
+ if ((info->threads = malloc(sizeof(pthread_t *) * info->num_threads))
+ == NULL) {
+ MSG("Can't allocate memory for threads. %s\n",
+ strerror(errno));
+ return FALSE;
+ }
+ memset(info->threads, 0, sizeof(pthread_t *) * info->num_threads);
+
+ if ((info->kdump_thread_args =
+ malloc(sizeof(struct thread_args) * info->num_threads))
+ == NULL) {
+ MSG("Can't allocate memory for arguments of threads. %s\n",
+ strerror(errno));
+ return FALSE;
+ }
+ memset(info->kdump_thread_args, 0, sizeof(struct thread_args) * info->num_threads);
+
+ for (i = 0; i < info->num_threads; i++) {
+ if ((info->threads[i] = malloc(sizeof(pthread_t))) == NULL) {
+ MSG("Can't allocate memory for thread %d. %s",
+ i, strerror(errno));
+ return FALSE;
+ }
+
+ if ((BUF_PARALLEL(i) = malloc(info->page_size)) == NULL) {
+ MSG("Can't allocate memory for the memory buffer. %s\n",
+ strerror(errno));
+ return FALSE;
+ }
+
+ if ((BUF_OUT_PARALLEL(i) = malloc(len_buf_out)) == NULL) {
+ MSG("Can't allocate memory for the compression buffer. %s\n",
+ strerror(errno));
+ return FALSE;
+ }
+
+ if ((MMAP_CACHE_PARALLEL(i) = malloc(sizeof(struct mmap_cache))) == NULL) {
+ MSG("Can't allocate memory for mmap_cache. %s\n",
+ strerror(errno));
+ return FALSE;
+ }
+
+ /*
+ * initialize mmap_cache
+ */
+ MMAP_CACHE_PARALLEL(i)->mmap_buf = MAP_FAILED;
+ MMAP_CACHE_PARALLEL(i)->mmap_start_offset = 0;
+ MMAP_CACHE_PARALLEL(i)->mmap_end_offset = 0;
+
+#ifdef USELZO
+ if ((WRKMEM_PARALLEL(i) = malloc(LZO1X_1_MEM_COMPRESS)) == NULL) {
+ MSG("Can't allocate memory for the working memory. %s\n",
+ strerror(errno));
+ return FALSE;
+ }
+#endif
+ }
+
+ /*
+ * get a safe number of page_data
+ */
+ page_data_buf_size = MAX(len_buf_out, info->page_size);
+
+ limit_size = (get_free_memory_size()
+ - MAP_REGION * info->num_threads) * 0.6;
+
+ page_data_num = limit_size / page_data_buf_size;
+
+ if (info->num_buffers != 0)
+ info->num_buffers = MIN(info->num_buffers, page_data_num);
+ else
+ info->num_buffers = MIN(PAGE_DATA_NUM, page_data_num);
+
+ DEBUG_MSG("Number of struct page_data for produce/consume: %d\n",
+ info->num_buffers);
+
+ /*
+ * allocate memory for page_data
+ */
+ if ((info->page_data_buf = malloc(sizeof(struct page_data) * info->num_buffers))
+ == NULL) {
+ MSG("Can't allocate memory for page_data_buf. %s\n",
+ strerror(errno));
+ return FALSE;
+ }
+ memset(info->page_data_buf, 0, sizeof(struct page_data) * info->num_buffers);
+
+ for (i = 0; i < info->num_buffers; i++) {
+ if ((info->page_data_buf[i].buf = malloc(page_data_buf_size)) == NULL) {
+ MSG("Can't allocate memory for buf of page_data_buf. %s\n",
+ strerror(errno));
+ return FALSE;
+ }
+ }
+
+ /*
+ * initialize fd_memory for threads
+ */
+ for (i = 0; i < info->num_threads; i++) {
+ if ((FD_MEMORY_PARALLEL(i) = open(info->name_memory, O_RDONLY))
+ < 0) {
+ ERRMSG("Can't open the dump memory(%s). %s\n",
+ info->name_memory, strerror(errno));
+ return FALSE;
+ }
+
+ if ((FD_BITMAP_MEMORY_PARALLEL(i) =
+ open(info->name_memory, O_RDONLY)) < 0) {
+ ERRMSG("Can't open the dump memory(%s). %s\n",
+ info->name_memory, strerror(errno));
+ return FALSE;
+ }
+ }
+
+ return TRUE;
+}
+
+void
+free_for_parallel()
+{
+ int i;
+
+ if (info->threads != NULL) {
+ for (i = 0; i < info->num_threads; i++) {
+ if (info->threads[i] != NULL)
+ free(info->threads[i]);
+
+ if (BUF_PARALLEL(i) != NULL)
+ free(BUF_PARALLEL(i));
+
+ if (BUF_OUT_PARALLEL(i) != NULL)
+ free(BUF_OUT_PARALLEL(i));
+
+ if (MMAP_CACHE_PARALLEL(i) != NULL) {
+ if (MMAP_CACHE_PARALLEL(i)->mmap_buf !=
+ MAP_FAILED)
+ munmap(MMAP_CACHE_PARALLEL(i)->mmap_buf,
+ MMAP_CACHE_PARALLEL(i)->mmap_end_offset
+ - MMAP_CACHE_PARALLEL(i)->mmap_start_offset);
+
+ free(MMAP_CACHE_PARALLEL(i));
+ }
+#ifdef USELZO
+ if (WRKMEM_PARALLEL(i) != NULL)
+ free(WRKMEM_PARALLEL(i));
+#endif
+
+ }
+ free(info->threads);
+ }
+
+ if (info->kdump_thread_args != NULL)
+ free(info->kdump_thread_args);
+
+ if (info->page_data_buf != NULL) {
+ for (i = 0; i < info->num_buffers; i++) {
+ if (info->page_data_buf[i].buf != NULL)
+ free(info->page_data_buf[i].buf);
+ }
+ free(info->page_data_buf);
+ }
+
+ for (i = 0; i < info->num_threads; i++) {
+ if (FD_MEMORY_PARALLEL(i) > 0)
+ close(FD_MEMORY_PARALLEL(i));
+
+ if (FD_BITMAP_MEMORY_PARALLEL(i) > 0)
+ close(FD_BITMAP_MEMORY_PARALLEL(i));
+ }
+}
+
+int
initial(void)
{
off_t offset;
diff --git a/makedumpfile.h b/makedumpfile.h
index 5dbea60..d0760d9 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -972,6 +972,7 @@ typedef unsigned long long int ulonglong;
* for parallel process
*/
+#define PAGE_DATA_NUM (50)
#define WAIT_TIME (60 * 10)
#define PTHREAD_FAIL ((void *)-2)
--
1.7.1
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 09/10] Make makedumpfile available to read and compress pages parallelly
2015-07-21 6:29 [PATCH v3 00/10] makedumpfile: parallel processing Zhou Wenjian
` (7 preceding siblings ...)
2015-07-21 6:29 ` [PATCH v3 08/10] Initial and free data used for " Zhou Wenjian
@ 2015-07-21 6:29 ` Zhou Wenjian
2015-07-21 6:29 ` [PATCH v3 10/10] Add usage and manual about multiple threads process Zhou Wenjian
2015-07-21 7:10 ` [PATCH v3 00/10] makedumpfile: parallel processing "Zhou, Wenjian/周文剑"
10 siblings, 0 replies; 18+ messages in thread
From: Zhou Wenjian @ 2015-07-21 6:29 UTC (permalink / raw)
To: kexec; +Cc: Qiao Nuohan
From: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
With this patch, multiple threads can be used to read and compress
pages, which saves time.
Currently, sadump and Xen kdump are not supported.
Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
---
makedumpfile.c | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
makedumpfile.h | 2 +
2 files changed, 68 insertions(+), 2 deletions(-)
diff --git a/makedumpfile.c b/makedumpfile.c
index 417741f..7003d10 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -3857,6 +3857,27 @@ out:
DEBUG_MSG("Buffer size for the cyclic mode: %ld\n", info->bufsize_cyclic);
}
+ if (info->num_threads) {
+ if (is_xen_memory()) {
+ MSG("'--num-threads' option is disabled,\n");
+ MSG("because %s is Xen's memory core image.\n",
+ info->name_memory);
+ return FALSE;
+ }
+
+ if (info->flag_sadump) {
+ MSG("'--num-threads' option is disabled,\n");
+ MSG("because %s is sadump %s format.\n",
+ info->name_memory, sadump_format_type_name());
+ return FALSE;
+ }
+
+ if (!initial_for_parallel()) {
+ MSG("Failed to initialize for parallel process.\n");
+ return FALSE;
+ }
+ }
+
if (!is_xen_memory() && !cache_init())
return FALSE;
@@ -7905,9 +7926,16 @@ write_kdump_pages_and_bitmap_cyclic(struct cache_data *cd_header, struct cache_d
if (!write_kdump_bitmap2(&cycle))
return FALSE;
- if (!write_kdump_pages_cyclic(cd_header, cd_page, &pd_zero,
+ if (info->num_threads) {
+ if (!write_kdump_pages_parallel_cyclic(cd_header,
+ cd_page, &pd_zero,
+ &offset_data, &cycle))
+ return FALSE;
+ } else {
+ if (!write_kdump_pages_cyclic(cd_header, cd_page, &pd_zero,
&offset_data, &cycle))
- return FALSE;
+ return FALSE;
+ }
}
free_bitmap2_buffer();
@@ -9874,6 +9902,18 @@ check_param_for_creating_dumpfile(int argc, char *argv[])
if (info->flag_sadump_diskset && !sadump_is_supported_arch())
return FALSE;
+ if (info->num_threads) {
+ if (info->flag_split) {
+ MSG("--num-threads cannot be used with --split.\n");
+ return FALSE;
+ }
+
+ if (info->flag_elf_dumpfile) {
+ MSG("--num-threads cannot be used with ELF format.\n");
+ return FALSE;
+ }
+ }
+
if ((argc == optind + 2) && !info->flag_flatten
&& !info->flag_split
&& !info->flag_sadump_diskset) {
@@ -9938,6 +9978,18 @@ check_param_for_creating_dumpfile(int argc, char *argv[])
} else
return FALSE;
+ if (info->num_threads) {
+ if ((info->parallel_info =
+ malloc(sizeof(parallel_info_t) * info->num_threads))
+ == NULL) {
+ MSG("Can't allocate memory for parallel_info.\n");
+ return FALSE;
+ }
+
+ memset(info->parallel_info, 0, sizeof(parallel_info_t)
+ * info->num_threads);
+ }
+
return TRUE;
}
@@ -10254,6 +10306,8 @@ static struct option longopts[] = {
{"mem-usage", no_argument, NULL, OPT_MEM_USAGE},
{"splitblock-size", required_argument, NULL, OPT_SPLITBLOCK_SIZE},
{"work-dir", required_argument, NULL, OPT_WORKING_DIR},
+ {"num-threads", required_argument, NULL, OPT_NUM_THREADS},
+ {"num-buffers", required_argument, NULL, OPT_NUM_BUFFERS},
{0, 0, 0, 0}
};
@@ -10398,6 +10452,12 @@ main(int argc, char *argv[])
case OPT_WORKING_DIR:
info->working_dir = optarg;
break;
+ case OPT_NUM_THREADS:
+ info->num_threads = atoi(optarg);
+ break;
+ case OPT_NUM_BUFFERS:
+ info->num_buffers = atoi(optarg);
+ break;
case '?':
MSG("Commandline parameter is invalid.\n");
MSG("Try `makedumpfile --help' for more information.\n");
@@ -10541,6 +10601,8 @@ out:
else if (!info->flag_mem_usage)
MSG("makedumpfile Completed.\n");
+ free_for_parallel();
+
if (info) {
if (info->dh_memory)
free(info->dh_memory);
@@ -10568,6 +10630,8 @@ out:
free(info->p2m_mfn_frame_list);
if (info->page_buf != NULL)
free(info->page_buf);
+ if (info->parallel_info != NULL)
+ free(info->parallel_info);
free(info);
if (splitblock) {
diff --git a/makedumpfile.h b/makedumpfile.h
index d0760d9..9dfe5b6 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -2032,6 +2032,8 @@ struct elf_prstatus {
#define OPT_MEM_USAGE OPT_START+13
#define OPT_SPLITBLOCK_SIZE OPT_START+14
#define OPT_WORKING_DIR OPT_START+15
+#define OPT_NUM_THREADS OPT_START+16
+#define OPT_NUM_BUFFERS OPT_START+17
/*
* Function Prototype.
--
1.7.1
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH v3 10/10] Add usage and manual about multiple threads process
2015-07-21 6:29 [PATCH v3 00/10] makedumpfile: parallel processing Zhou Wenjian
` (8 preceding siblings ...)
2015-07-21 6:29 ` [PATCH v3 09/10] Make makedumpfile available to read and compress pages parallelly Zhou Wenjian
@ 2015-07-21 6:29 ` Zhou Wenjian
2015-07-21 7:10 ` [PATCH v3 00/10] makedumpfile: parallel processing "Zhou, Wenjian/周文剑"
10 siblings, 0 replies; 18+ messages in thread
From: Zhou Wenjian @ 2015-07-21 6:29 UTC (permalink / raw)
To: kexec; +Cc: Qiao Nuohan
From: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com>
---
makedumpfile.8 | 24 ++++++++++++++++++++++++
print_info.c | 16 ++++++++++++++++
2 files changed, 40 insertions(+), 0 deletions(-)
diff --git a/makedumpfile.8 b/makedumpfile.8
index 2d38cd0..b400a14 100644
--- a/makedumpfile.8
+++ b/makedumpfile.8
@@ -12,6 +12,8 @@ makedumpfile \- make a small dumpfile of kdump
.br
\fBmakedumpfile\fR \-\-split [\fIOPTION\fR] [\-x \fIVMLINUX\fR|\-i \fIVMCOREINFO\fR] \fIVMCORE\fR \fIDUMPFILE1\fR \fIDUMPFILE2\fR [\fIDUMPFILE3\fR ..]
.br
+\fBmakedumpfile\fR [\fIOPTION\fR] [\-x \fIVMLINUX\fR|\-i \fIVMCOREINFO\fR] \-\-num\-threads \fITHREADNUM\fR [\-\-num\-buffers \fIBUFNUM\fR] \fIVMCORE\fR \fIDUMPFILE\fR
+.br
\fBmakedumpfile\fR \-\-reassemble \fIDUMPFILE1\fR \fIDUMPFILE2\fR [\fIDUMPFILE3\fR ..] \fIDUMPFILE\fR
.br
\fBmakedumpfile\fR \-g \fIVMCOREINFO\fR \-x \fIVMLINUX\fR
@@ -371,6 +373,28 @@ the kdump\-compressed format.
# makedumpfile \-\-split \-d 31 \-x vmlinux /proc/vmcore dumpfile1 dumpfile2
.TP
+\fB\-\-num\-threads\fR \fITHREADNUM\fR
+Read and compress the data of each page in parallel, using multiple threads.
+This reduces the time needed to save \fIDUMPFILE\fR.
+This feature only supports creating \fIDUMPFILE\fR in kdump\-compressed
+format from \fIVMCORE\fR in kdump\-compressed or elf format.
+.br
+.B Example:
+.br
+# makedumpfile \-d 31 \-\-num\-threads 4 /proc/vmcore dumpfile
+
+.TP
+\fB\-\-num\-buffers\fR \fIBUFNUM\fR
+This option is used together with \-\-num\-threads. The multi\-threaded
+process needs buffers to temporarily store the page data generated by
+the threads; this option specifies how many pages can be stored.
+.br
+.B Example:
+.br
+# makedumpfile \-d 31 \-\-num\-threads 4 \-\-num\-buffers 30 /proc/vmcore dumpfile
+
+.TP
\fB\-\-reassemble\fR
Reassemble multiple \fIDUMPFILE\fRs, which are created by \-\-split option,
into one \fIDUMPFILE\fR. dumpfile1 and dumpfile2 are reassembled into dumpfile
diff --git a/print_info.c b/print_info.c
index 9c36bec..e8a6b40 100644
--- a/print_info.c
+++ b/print_info.c
@@ -76,6 +76,10 @@ print_usage(void)
MSG(" # makedumpfile --split [OPTION] [-x VMLINUX|-i VMCOREINFO] VMCORE DUMPFILE1\n");
MSG(" DUMPFILE2 [DUMPFILE3 ..]\n");
MSG("\n");
+ MSG(" Using multiple threads to create DUMPFILE in parallel:\n");
+ MSG(" # makedumpfile [OPTION] [-x VMLINUX|-i VMCOREINFO] --num-threads THREADNUM\n");
+ MSG(" [--num-buffers BUFNUM] VMCORE DUMPFILE\n");
+ MSG("\n");
MSG(" Reassemble multiple DUMPFILEs:\n");
MSG(" # makedumpfile --reassemble DUMPFILE1 DUMPFILE2 [DUMPFILE3 ..] DUMPFILE\n");
MSG("\n");
@@ -184,6 +188,18 @@ print_usage(void)
MSG(" by the number of DUMPFILEs.\n");
MSG(" This feature supports only the kdump-compressed format.\n");
MSG("\n");
+ MSG(" [--num-threads THREADNUM]:\n");
+ MSG(" Read and compress the data of each page in parallel, using multiple\n");
+ MSG(" threads. This reduces the time needed to save DUMPFILE.\n");
+ MSG(" This feature only supports creating DUMPFILE in kdump-compressed format\n");
+ MSG(" from VMCORE in kdump-compressed or elf format.\n");
+ MSG("\n");
+ MSG(" [--num-buffers BUFNUM]:\n");
+ MSG(" This option is used together with --num-threads. The multi-threaded\n");
+ MSG(" process needs buffers to temporarily store the page data generated by\n");
+ MSG(" the threads; this option specifies how many pages can be stored.\n");
+ MSG("\n");
MSG(" [--reassemble]:\n");
MSG(" Reassemble multiple DUMPFILEs, which are created by --split option,\n");
MSG(" into one DUMPFILE. dumpfile1 and dumpfile2 are reassembled into dumpfile.\n");
--
1.7.1
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH v3 00/10] makedumpfile: parallel processing
2015-07-21 6:29 [PATCH v3 00/10] makedumpfile: parallel processing Zhou Wenjian
` (9 preceding siblings ...)
2015-07-21 6:29 ` [PATCH v3 10/10] Add usage and manual about multiple threads process Zhou Wenjian
@ 2015-07-21 7:10 ` "Zhou, Wenjian/周文剑"
2015-07-23 6:20 ` Atsushi Kumagai
10 siblings, 1 reply; 18+ messages in thread
From: "Zhou, Wenjian/周文剑" @ 2015-07-21 7:10 UTC (permalink / raw)
To: atsushi Kumagai; +Cc: kexec@lists.infradead.org
Hello Kumagai,
PATCH v3 has improved the performance.
The performance degradation in PATCH v2 was mainly caused by the page
faults produced by the function compress2().
I wrote some code to test the performance of compress2(). It costs almost
the same time and produces almost the same number of page faults as
executing compress2() in a thread.
To reduce the page faults, I had to do the following in kdump_thread_function_cyclic():
+ /*
+ * lock memory to reduce page_faults by compress2()
+ */
+ void *temp = malloc(1);
+ memset(temp, 0, 1);
+ mlockall(MCL_CURRENT);
+ free(temp);
+
With this, the performance is almost the same whether a thread is used or not.
On our machine, I can get the same results as the following with PATCH v2.
> Test2-1:
> | threads | compress time | exec time |
> | 1 | 76.12 | 82.13 |
>
> Test2-2:
> | threads | compress time | exec time |
> | 1 | 41.97 | 51.46 |
I tested the new patch set on the machine, and below are the results.
PATCH V2:
###################################
- System: PRIMEQUEST 1800E
- CPU: Intel(R) Xeon(R) CPU E7540
- memory: 32GB
###################################
************ makedumpfile -d 0 ******************
core-data 0 256 512 768 1024 1280 1536 1792
threads-num
-c
0 158 1505 2119 2129 1707 1483 1440 1273
4 207 589 672 673 636 564 536 514
8 176 327 377 387 367 336 314 291
12 191 272 295 306 288 259 257 240
************ makedumpfile -d 7 ******************
core-data 0 256 512 768 1024 1280 1536 1792
threads-num
-c
0 154 1508 2089 2133 1792 1660 1462 1312
4 203 594 684 701 627 592 535 503
8 172 326 377 393 366 334 313 286
12 182 273 295 308 283 258 249 237
PATCH v3:
###################################
- System: PRIMEQUEST 1800E
- CPU: Intel(R) Xeon(R) CPU E7540
- memory: 32GB
###################################
************ makedumpfile -d 0 ******************
core-data 0 256 512 768 1024 1280 1536 1792
threads-num
-c
0 192 1488 1830
4 62 393 477
8 78 211 258
************ makedumpfile -d 7 ******************
core-data 0 256 512 768 1024 1280 1536 1792
threads-num
-c
0 197 1475 1815
4 62 396 482
8 78 209 252
--
Thanks
Zhou Wenjian
On 07/21/2015 02:29 PM, Zhou Wenjian wrote:
> This patch set implements parallel processing by means of multiple threads.
> With this patch set, it is available to use multiple threads to read
> and compress pages. This parallel process will save time.
> This feature only supports creating dumpfile in kdump-compressed format from
> vmcore in kdump-compressed format or elf format. Currently, sadump and
> xen kdump are not supported.
>
> Qiao Nuohan (10):
> Add readpage_kdump_compressed_parallel
> Add mappage_elf_parallel
> Add readpage_elf_parallel
> Add read_pfn_parallel
> Add function to initial bitmap for parallel use
> Add filter_data_buffer_parallel
> Add write_kdump_pages_parallel to allow parallel process
> Initial and free data used for parallel process
> Make makedumpfile available to read and compress pages parallelly
> Add usage and manual about multiple threads process
>
> Makefile | 2 +
> erase_info.c | 29 ++-
> erase_info.h | 2 +
> makedumpfile.8 | 24 ++
> makedumpfile.c | 1095 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
> makedumpfile.h | 80 ++++
> print_info.c | 16 +
> 7 files changed, 1245 insertions(+), 3 deletions(-)
>
>
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
>
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH v3 00/10] makedumpfile: parallel processing
2015-07-21 7:10 ` [PATCH v3 00/10] makedumpfile: parallel processing "Zhou, Wenjian/周文剑"
@ 2015-07-23 6:20 ` Atsushi Kumagai
2015-07-23 6:39 ` "Zhou, Wenjian/周文剑"
0 siblings, 1 reply; 18+ messages in thread
From: Atsushi Kumagai @ 2015-07-23 6:20 UTC (permalink / raw)
To: zhouwj-fnst@cn.fujitsu.com; +Cc: kexec@lists.infradead.org
>Hello Kumagai,
>
>The PATCH v3 has improved the performance.
>The performance degradation in PATCH v2 mainly caused by the page_fault
>produced by the function compress2().
>
>I wrote some codes to test the performance of compress2. It almost costs
>the same time and produces the same amount of page_fault as executing compress2
>in thread.
>
>To reduce page_faults, I have to do the following in kdump_thread_function_cyclic().
>
>+ /*
>+ * lock memory to reduce page_faults by compress2()
>+ */
>+ void *temp = malloc(1);
>+ memset(temp, 0, 1);
>+ mlockall(MCL_CURRENT);
>+ free(temp);
>+
>
>With this, using a thread or not almost has the same performance.
Hmm... I can't get good results with this patch; many page faults still
occur. I guess mlock will change when the page faults occur, but will not
change the total number of page faults.
Could you explain why compress2() causes many page faults only in a thread?
Then I may understand why this patch is meaningful.
Thanks
Atsushi Kumagai
>In our machine, I can get the same result as the following with PATCH v2.
> > Test2-1:
> > | threads | compress time | exec time |
> > | 1 | 76.12 | 82.13 |
>
> > Test2-2:
> > | threads | compress time | exec time |
> > | 1 | 41.97 | 51.46 |
>
>I test the new patch set in the machine, and below is the results.
>
>PATCH V2:
>###################################
>- System: PRIMEQUEST 1800E
>- CPU: Intel(R) Xeon(R) CPU E7540
>- memory: 32GB
>###################################
>************ makedumpfile -d 0 ******************
> core-data 0 256 512 768 1024 1280 1536 1792
> threads-num
>-c
> 0 158 1505 2119 2129 1707 1483 1440 1273
> 4 207 589 672 673 636 564 536 514
> 8 176 327 377 387 367 336 314 291
> 12 191 272 295 306 288 259 257 240
>
>************ makedumpfile -d 7 ******************
> core-data 0 256 512 768 1024 1280 1536 1792
> threads-num
>-c
> 0 154 1508 2089 2133 1792 1660 1462 1312
> 4 203 594 684 701 627 592 535 503
> 8 172 326 377 393 366 334 313 286
> 12 182 273 295 308 283 258 249 237
>
>
>
>PATCH v3:
>###################################
>- System: PRIMEQUEST 1800E
>- CPU: Intel(R) Xeon(R) CPU E7540
>- memory: 32GB
>###################################
>************ makedumpfile -d 0 ******************
> core-data 0 256 512 768 1024 1280 1536 1792
> threads-num
>-c
> 0 192 1488 1830
> 4 62 393 477
> 8 78 211 258
>
>************ makedumpfile -d 7 ******************
> core-data 0 256 512 768 1024 1280 1536 1792
> threads-num
>-c
> 0 197 1475 1815
> 4 62 396 482
> 8 78 209 252
>
>
>--
>Thanks
>Zhou Wenjian
>
>On 07/21/2015 02:29 PM, Zhou Wenjian wrote:
>> This patch set implements parallel processing by means of multiple threads.
>> With this patch set, it is available to use multiple threads to read
>> and compress pages. This parallel process will save time.
>> This feature only supports creating dumpfile in kdump-compressed format from
>> vmcore in kdump-compressed format or elf format. Currently, sadump and
>> xen kdump are not supported.
>>
>> Qiao Nuohan (10):
>> Add readpage_kdump_compressed_parallel
>> Add mappage_elf_parallel
>> Add readpage_elf_parallel
>> Add read_pfn_parallel
>> Add function to initial bitmap for parallel use
>> Add filter_data_buffer_parallel
>> Add write_kdump_pages_parallel to allow parallel process
>> Initial and free data used for parallel process
>> Make makedumpfile available to read and compress pages parallelly
>> Add usage and manual about multiple threads process
>>
>> Makefile | 2 +
>> erase_info.c | 29 ++-
>> erase_info.h | 2 +
>> makedumpfile.8 | 24 ++
>> makedumpfile.c | 1095 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>> makedumpfile.h | 80 ++++
>> print_info.c | 16 +
>> 7 files changed, 1245 insertions(+), 3 deletions(-)
>>
>>
>> _______________________________________________
>> kexec mailing list
>> kexec@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/kexec
>>
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 00/10] makedumpfile: parallel processing
2015-07-23 6:20 ` Atsushi Kumagai
@ 2015-07-23 6:39 ` "Zhou, Wenjian/周文剑"
2015-07-31 8:27 ` Atsushi Kumagai
0 siblings, 1 reply; 18+ messages in thread
From: "Zhou, Wenjian/周文剑" @ 2015-07-23 6:39 UTC (permalink / raw)
To: Atsushi Kumagai; +Cc: kexec@lists.infradead.org
On 07/23/2015 02:20 PM, Atsushi Kumagai wrote:
>> Hello Kumagai,
>>
>> The PATCH v3 has improved the performance.
>> The performance degradation in PATCH v2 mainly caused by the page_fault
>> produced by the function compress2().
>>
>> I wrote some codes to test the performance of compress2. It almost costs
>> the same time and produces the same amount of page_fault as executing compress2
>> in thread.
>>
>> To reduce page_faults, I have to do the following in kdump_thread_function_cyclic().
>>
>> + /*
>> + * lock memory to reduce page_faults by compress2()
>> + */
>> + void *temp = malloc(1);
>> + memset(temp, 0, 1);
>> + mlockall(MCL_CURRENT);
>> + free(temp);
>> +
>>
>> With this, using a thread or not almost has the same performance.
>
> Hmm... I can't get good results with this patch, many page faults still
> occur. I guess mlock will change when page faults occur, but will not
> change the total number of page faults.
> Could you explain why compress2() causes many page faults only in thread,
> then I may understand why this patch is meaningful.
>
Actually, it also causes many page faults even outside a thread, if
info->bitmap2 is not freed in makedumpfile.
I wrote some code to test the performance of compress2().
<cut>
buf = malloc(PAGE_SIZE);
bufout = malloc(SIZE_OUT);
memset(buf, 1, PAGE_SIZE / 2);
while (1)
compress2(bufout, &size_out, buf, PAGE_SIZE, Z_BEST_SPEED);
<cut>
The code looks roughly like this.
It causes many page faults.
But if the code is changed to the following, it is much better.
<cut>
temp = malloc(TEMP_SIZE);
memset(temp, 0, TEMP_SIZE);
free(temp);
buf = malloc(PAGE_SIZE);
bufout = malloc(SIZE_OUT);
memset(buf, 1, PAGE_SIZE / 2);
while (1)
compress2(bufout, &size_out, buf, PAGE_SIZE, Z_BEST_SPEED);
<cut>
TEMP_SIZE must be large enough
(larger than 135097 works on my machine).
In a thread, the following code reduces the page faults.
<cut>
temp = malloc(1);
memset(temp, 0, 1);
mlockall(MCL_CURRENT);
free(temp);
buf = malloc(PAGE_SIZE);
bufout = malloc(SIZE_OUT);
memset(buf, 1, PAGE_SIZE / 2);
while (1)
compress2(bufout, &size_out, buf, PAGE_SIZE, Z_BEST_SPEED);
<cut>
I don't know why yet.
--
Thanks
Zhou Wenjian
>
> Thanks
> Atsushi Kumagai
>
>> In our machine, I can get the same result as the following with PATCH v2.
>>> Test2-1:
>>> | threads | compress time | exec time |
>>> | 1 | 76.12 | 82.13 |
> >
>>> Test2-2:
>>> | threads | compress time | exec time |
>>> | 1 | 41.97 | 51.46 |
>>
>> I test the new patch set in the machine, and below is the results.
>>
>> PATCH V2:
>> ###################################
>> - System: PRIMEQUEST 1800E
>> - CPU: Intel(R) Xeon(R) CPU E7540
>> - memory: 32GB
>> ###################################
>> ************ makedumpfile -d 0 ******************
>> core-data 0 256 512 768 1024 1280 1536 1792
>> threads-num
>> -c
>> 0 158 1505 2119 2129 1707 1483 1440 1273
>> 4 207 589 672 673 636 564 536 514
>> 8 176 327 377 387 367 336 314 291
>> 12 191 272 295 306 288 259 257 240
>>
>> ************ makedumpfile -d 7 ******************
>> core-data 0 256 512 768 1024 1280 1536 1792
>> threads-num
>> -c
>> 0 154 1508 2089 2133 1792 1660 1462 1312
>> 4 203 594 684 701 627 592 535 503
>> 8 172 326 377 393 366 334 313 286
>> 12 182 273 295 308 283 258 249 237
>>
>>
>>
>> PATCH v3:
>> ###################################
>> - System: PRIMEQUEST 1800E
>> - CPU: Intel(R) Xeon(R) CPU E7540
>> - memory: 32GB
>> ###################################
>> ************ makedumpfile -d 0 ******************
>> core-data 0 256 512 768 1024 1280 1536 1792
>> threads-num
>> -c
>> 0 192 1488 1830
>> 4 62 393 477
>> 8 78 211 258
>>
>> ************ makedumpfile -d 7 ******************
>> core-data 0 256 512 768 1024 1280 1536 1792
>> threads-num
>> -c
>> 0 197 1475 1815
>> 4 62 396 482
>> 8 78 209 252
>>
>>
>> --
>> Thanks
>> Zhou Wenjian
>>
>> On 07/21/2015 02:29 PM, Zhou Wenjian wrote:
>>> This patch set implements parallel processing by means of multiple threads.
>>> With this patch set, it is available to use multiple threads to read
>>> and compress pages. This parallel process will save time.
>>> This feature only supports creating dumpfile in kdump-compressed format from
>>> vmcore in kdump-compressed format or elf format. Currently, sadump and
>>> xen kdump are not supported.
>>>
>>> Qiao Nuohan (10):
>>> Add readpage_kdump_compressed_parallel
>>> Add mappage_elf_parallel
>>> Add readpage_elf_parallel
>>> Add read_pfn_parallel
>>> Add function to initial bitmap for parallel use
>>> Add filter_data_buffer_parallel
>>> Add write_kdump_pages_parallel to allow parallel process
>>> Initial and free data used for parallel process
>>> Make makedumpfile available to read and compress pages parallelly
>>> Add usage and manual about multiple threads process
>>>
>>> Makefile | 2 +
>>> erase_info.c | 29 ++-
>>> erase_info.h | 2 +
>>> makedumpfile.8 | 24 ++
>>> makedumpfile.c | 1095 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>> makedumpfile.h | 80 ++++
>>> print_info.c | 16 +
>>> 7 files changed, 1245 insertions(+), 3 deletions(-)
>>>
>>>
>>> _______________________________________________
>>> kexec mailing list
>>> kexec@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/kexec
>>>
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: [PATCH v3 00/10] makedumpfile: parallel processing
2015-07-23 6:39 ` "Zhou, Wenjian/周文剑"
@ 2015-07-31 8:27 ` Atsushi Kumagai
2015-07-31 9:35 ` "Zhou, Wenjian/周文剑"
0 siblings, 1 reply; 18+ messages in thread
From: Atsushi Kumagai @ 2015-07-31 8:27 UTC (permalink / raw)
To: "Zhou, Wenjian/周文剑"
Cc: kexec@lists.infradead.org
>On 07/23/2015 02:20 PM, Atsushi Kumagai wrote:
>>> Hello Kumagai,
>>>
>>> The PATCH v3 has improved the performance.
>>> The performance degradation in PATCH v2 mainly caused by the page_fault
>>> produced by the function compress2().
>>>
>>> I wrote some codes to test the performance of compress2. It almost costs
>>> the same time and produces the same amount of page_fault as executing compress2
>>> in thread.
>>>
>>> To reduce page_faults, I have to do the following in kdump_thread_function_cyclic().
>>>
>>> + /*
>>> + * lock memory to reduce page_faults by compress2()
>>> + */
>>> + void *temp = malloc(1);
>>> + memset(temp, 0, 1);
>>> + mlockall(MCL_CURRENT);
>>> + free(temp);
>>> +
>>>
>>> With this, using a thread or not almost has the same performance.
>>
>> Hmm... I can't get good results with this patch, many page faults still
>> occur. I guess mlock will change when page faults occur, but will not
>> change the total number of page faults.
>> Could you explain why compress2() causes many page faults only in thread,
>> then I may understand why this patch is meaningful.
>>
>
>Actually, it will also cause so much page faults even not in thread, if
>info->bitmap2 is not freed in makedumpfile.
>
>I wrote some codes to test the performance of compress2().
>
><cut>
>buf = malloc(PAGE_SIZE);
>bufout = malloc(SIZE_OUT);
>memset(buf, 1, PAGE_SIZE / 2);
>while (1)
> compress2(bufout, &size_out, buf, PAGE_SIZE, Z_BEST_SPEED);
><cut>
>
>The codes almost like this.
>It will cause much page faults.
>
>But if the codes turn to be the following, it will be much better.
>
><cut>
>temp = malloc(TEMP_SIZE);
>memset(temp, 0, TEMP_SIZE);
>free(temp);
>
>buf = malloc(PAGE_SIZE);
>bufout = malloc(SIZE_OUT);
>memset(buf, 1, PAGE_SIZE / 2);
>while (1)
> compress2(bufout, &size_out, buf, PAGE_SIZE, Z_BEST_SPEED);
><cut>
>
>TEMP_SIZE must be large enough.
>(larger than 135097 will work,in my machine)
>
>
>If in thread, the following codes can reduce the page faults.
>
><cut>
>temp = malloc(1);
>memset(temp, 0, 1);
>mlockall(MCL_CURRENT);
>free(temp);
>
>buf = malloc(PAGE_SIZE);
>bufout = malloc(SIZE_OUT);
>memset(buf, 1, PAGE_SIZE / 2);
>while (1)
> compress2(bufout, &size_out, buf, PAGE_SIZE, Z_BEST_SPEED);
><cut>
>
>I haven't known why.
I assume that we are facing the known issue of glibc:
https://sourceware.org/ml/libc-alpha/2015-03/msg00270.html
According to the thread above, a per-thread arena is grown and trimmed
more readily than the main arena.
compress2() calls malloc() and free() internally each time it is invoked,
so every compression operation can cause page faults.
Moreover, I confirmed that many madvise(MADV_DONTNEED) calls are issued only
when compress2() is called in a thread.
OTOH, in the lzo case, a temporary working buffer is allocated on the caller
side, which reduces the number of malloc()/free() pairs.
(But I'm not sure why snappy doesn't hit this issue; its compression
buffer may be smaller than the trim threshold.)
Anyway, it's basically hard for zlib to avoid this issue on the application
side, so it seems we have to accept the performance degradation it causes.
Unfortunately, since the main target of this multi-thread feature is zlib, as
you measured, we should resolve this issue somehow.
Nevertheless, even now we can get some benefit from parallel processing,
so let's start discussing the implementation of the parallel processing
feature in order to accept this patch. I have some comments:
- read_pfn_parallel() doesn't use the cache feature (cache.c); is that
intentional?
- --num-buffers is now tunable, but neither the man description nor your
benchmark mentions the benefit of this parameter.
Thanks
Atsushi Kumagai
>--
>Thanks
>Zhou Wenjian
>
>>
>> Thanks
>> Atsushi Kumagai
>>
>>> In our machine, I can get the same result as the following with PATCH v2.
>>>> Test2-1:
>>>> | threads | compress time | exec time |
>>>> | 1 | 76.12 | 82.13 |
>> >
>>>> Test2-2:
>>>> | threads | compress time | exec time |
>>>> | 1 | 41.97 | 51.46 |
>>>
>>> I test the new patch set in the machine, and below is the results.
>>>
>>> PATCH V2:
>>> ###################################
>>> - System: PRIMEQUEST 1800E
>>> - CPU: Intel(R) Xeon(R) CPU E7540
>>> - memory: 32GB
>>> ###################################
>>> ************ makedumpfile -d 0 ******************
>>> core-data 0 256 512 768 1024 1280 1536 1792
>>> threads-num
>>> -c
>>> 0 158 1505 2119 2129 1707 1483 1440 1273
>>> 4 207 589 672 673 636 564 536 514
>>> 8 176 327 377 387 367 336 314 291
>>> 12 191 272 295 306 288 259 257 240
>>>
>>> ************ makedumpfile -d 7 ******************
>>> core-data 0 256 512 768 1024 1280 1536 1792
>>> threads-num
>>> -c
>>> 0 154 1508 2089 2133 1792 1660 1462 1312
>>> 4 203 594 684 701 627 592 535 503
>>> 8 172 326 377 393 366 334 313 286
>>> 12 182 273 295 308 283 258 249 237
>>>
>>>
>>>
>>> PATCH v3:
>>> ###################################
>>> - System: PRIMEQUEST 1800E
>>> - CPU: Intel(R) Xeon(R) CPU E7540
>>> - memory: 32GB
>>> ###################################
>>> ************ makedumpfile -d 0 ******************
>>> core-data 0 256 512 768 1024 1280 1536 1792
>>> threads-num
>>> -c
>>> 0 192 1488 1830
>>> 4 62 393 477
>>> 8 78 211 258
>>>
>>> ************ makedumpfile -d 7 ******************
>>> core-data 0 256 512 768 1024 1280 1536 1792
>>> threads-num
>>> -c
>>> 0 197 1475 1815
>>> 4 62 396 482
>>> 8 78 209 252
>>>
>>>
>>> --
>>> Thanks
>>> Zhou Wenjian
>>>
>>> On 07/21/2015 02:29 PM, Zhou Wenjian wrote:
>>>> This patch set implements parallel processing by means of multiple threads.
>>>> With this patch set, it is available to use multiple threads to read
>>>> and compress pages. This parallel process will save time.
>>>> This feature only supports creating dumpfile in kdump-compressed format from
>>>> vmcore in kdump-compressed format or elf format. Currently, sadump and
>>>> xen kdump are not supported.
>>>>
>>>> Qiao Nuohan (10):
>>>> Add readpage_kdump_compressed_parallel
>>>> Add mappage_elf_parallel
>>>> Add readpage_elf_parallel
>>>> Add read_pfn_parallel
>>>> Add function to initial bitmap for parallel use
>>>> Add filter_data_buffer_parallel
>>>> Add write_kdump_pages_parallel to allow parallel process
>>>> Initial and free data used for parallel process
>>>> Make makedumpfile available to read and compress pages parallelly
>>>> Add usage and manual about multiple threads process
>>>>
>>>> Makefile | 2 +
>>>> erase_info.c | 29 ++-
>>>> erase_info.h | 2 +
>>>> makedumpfile.8 | 24 ++
>>>> makedumpfile.c | 1095 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>> makedumpfile.h | 80 ++++
>>>> print_info.c | 16 +
>>>> 7 files changed, 1245 insertions(+), 3 deletions(-)
>>>>
>>>>
>>>> _______________________________________________
>>>> kexec mailing list
>>>> kexec@lists.infradead.org
>>>> http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 00/10] makedumpfile: parallel processing
2015-07-31 8:27 ` Atsushi Kumagai
@ 2015-07-31 9:35 ` "Zhou, Wenjian/周文剑"
2015-08-05 2:46 ` "Zhou, Wenjian/周文剑"
0 siblings, 1 reply; 18+ messages in thread
From: "Zhou, Wenjian/周文剑" @ 2015-07-31 9:35 UTC (permalink / raw)
To: Atsushi Kumagai; +Cc: kexec@lists.infradead.org
On 07/31/2015 04:27 PM, Atsushi Kumagai wrote:
>> On 07/23/2015 02:20 PM, Atsushi Kumagai wrote:
>>>> Hello Kumagai,
>>>>
>>>> The PATCH v3 has improved the performance.
>>>> The performance degradation in PATCH v2 mainly caused by the page_fault
>>>> produced by the function compress2().
>>>>
>>>> I wrote some codes to test the performance of compress2. It almost costs
>>>> the same time and produces the same amount of page_fault as executing compress2
>>>> in thread.
>>>>
>>>> To reduce page_faults, I have to do the following in kdump_thread_function_cyclic().
>>>>
>>>> + /*
>>>> + * lock memory to reduce page_faults by compress2()
>>>> + */
>>>> + void *temp = malloc(1);
>>>> + memset(temp, 0, 1);
>>>> + mlockall(MCL_CURRENT);
>>>> + free(temp);
>>>> +
>>>>
>>>> With this, using a thread or not almost has the same performance.
>>>
>>> Hmm... I can't get good results with this patch, many page faults still
>>> occur. I guess mlock will change when page faults occur, but will not
>>> change the total number of page faults.
>>> Could you explain why compress2() causes many page faults only in thread,
>>> then I may understand why this patch is meaningful.
>>>
>>
>> Actually, it will also cause so much page faults even not in thread, if
>> info->bitmap2 is not freed in makedumpfile.
>>
>> I wrote some codes to test the performance of compress2().
>>
>> <cut>
>> buf = malloc(PAGE_SIZE);
>> bufout = malloc(SIZE_OUT);
>> memset(buf, 1, PAGE_SIZE / 2);
>> while (1)
>> compress2(bufout, &size_out, buf, PAGE_SIZE, Z_BEST_SPEED);
>> <cut>
>>
>> The codes almost like this.
>> It will cause much page faults.
>>
>> But if the codes turn to be the following, it will be much better.
>>
>> <cut>
>> temp = malloc(TEMP_SIZE);
>> memset(temp, 0, TEMP_SIZE);
>> free(temp);
>>
>> buf = malloc(PAGE_SIZE);
>> bufout = malloc(SIZE_OUT);
>> memset(buf, 1, PAGE_SIZE / 2);
>> while (1)
>> compress2(bufout, &size_out, buf, PAGE_SIZE, Z_BEST_SPEED);
>> <cut>
>>
>> TEMP_SIZE must be large enough.
>> (larger than 135097 will work,in my machine)
>>
>>
>> If in thread, the following codes can reduce the page faults.
>>
>> <cut>
>> temp = malloc(1);
>> memset(temp, 0, 1);
>> mlockall(MCL_CURRENT);
>> free(temp);
>>
>> buf = malloc(PAGE_SIZE);
>> bufout = malloc(SIZE_OUT);
>> memset(buf, 1, PAGE_SIZE / 2);
>> while (1)
>> compress2(bufout, &size_out, buf, PAGE_SIZE, Z_BEST_SPEED);
>> <cut>
>>
>> I haven't known why.
>
> I assume that we are facing the known issue of glibc:
>
> https://sourceware.org/ml/libc-alpha/2015-03/msg00270.html
>
> According to the thread above, per-thread arena is easy to be grown and
> trimmed compared with main arena.
> Actually compress2() calls malloc() and free() for compression each time
> it is called, so every compression processing will cause page fault.
> Moreover, I confirmed that many madvise(MADV_DONTNEED) are invoked only
> when compress2() is called in thread.
>
> OTOH, in lzo case, a temp buffer for working is allocated on the caller
> side, so it can reduce the number of malloc()/free() pair.
> (but I'm not sure why snappy doesn't hit this issue. The buffer size
> for compression may be smaller than the trim threshold.)
>
> Anyway, basically it's hard for zlib to avoid this issue on the application
> side, it seems that we have to accept the performance degradation caused by it.
> Unfortunately, the main target of this multi thread feature is zlib as you
> measured, we should resolve this issue somehow.
>
> Nevertheless, even now we can get some benefit of parallel processing,
> so lets' start to discuss the implementation of the parallel processing
> feature to accept this patch. I have some comments:
>
> - read_pfn_parallel() doesn't use the cache feature(cache.c), is it
> intentional with you ?
>
Yes. Since the data here are read one page at a time, the cache feature
seems unnecessary.
> - Now --num-buffers is tunable but the man description and your benchmark
> didn't mention what is the benefit of this parameter.
>
The default value of num-buffers is 50. Originally the value had a great influence
on performance, but since we changed the logic in the 2nd version of the
patch set, more buffers bring little improvement (1000 buffers may give a 1% improvement).
I'm considering whether the option should be removed. What do you think?
BTW, the code (mlockall) added in the 3rd version works well on several machines here.
Should I keep it?
With that code, madvise(MADV_DONTNEED) fails in compress2() and the performance
is as expected on these machines.
--
Thanks
Zhou Wenjian
>
> Thanks
> Atsushi Kumagai
>
>> --
>> Thanks
>> Zhou Wenjian
>>
>>>
>>> Thanks
>>> Atsushi Kumagai
>>>
>>>> In our machine, I can get the same result as the following with PATCH v2.
>>>>> Test2-1:
>>>>> | threads | compress time | exec time |
>>>>> | 1 | 76.12 | 82.13 |
>>> >
>>>>> Test2-2:
>>>>> | threads | compress time | exec time |
>>>>> | 1 | 41.97 | 51.46 |
>>>>
>>>> I test the new patch set in the machine, and below is the results.
>>>>
>>>> PATCH V2:
>>>> ###################################
>>>> - System: PRIMEQUEST 1800E
>>>> - CPU: Intel(R) Xeon(R) CPU E7540
>>>> - memory: 32GB
>>>> ###################################
>>>> ************ makedumpfile -d 0 ******************
>>>> core-data 0 256 512 768 1024 1280 1536 1792
>>>> threads-num
>>>> -c
>>>> 0 158 1505 2119 2129 1707 1483 1440 1273
>>>> 4 207 589 672 673 636 564 536 514
>>>> 8 176 327 377 387 367 336 314 291
>>>> 12 191 272 295 306 288 259 257 240
>>>>
>>>> ************ makedumpfile -d 7 ******************
>>>> core-data 0 256 512 768 1024 1280 1536 1792
>>>> threads-num
>>>> -c
>>>> 0 154 1508 2089 2133 1792 1660 1462 1312
>>>> 4 203 594 684 701 627 592 535 503
>>>> 8 172 326 377 393 366 334 313 286
>>>> 12 182 273 295 308 283 258 249 237
>>>>
>>>>
>>>>
>>>> PATCH v3:
>>>> ###################################
>>>> - System: PRIMEQUEST 1800E
>>>> - CPU: Intel(R) Xeon(R) CPU E7540
>>>> - memory: 32GB
>>>> ###################################
>>>> ************ makedumpfile -d 0 ******************
>>>> core-data 0 256 512 768 1024 1280 1536 1792
>>>> threads-num
>>>> -c
>>>> 0 192 1488 1830
>>>> 4 62 393 477
>>>> 8 78 211 258
>>>>
>>>> ************ makedumpfile -d 7 ******************
>>>> core-data 0 256 512 768 1024 1280 1536 1792
>>>> threads-num
>>>> -c
>>>> 0 197 1475 1815
>>>> 4 62 396 482
>>>> 8 78 209 252
>>>>
>>>>
>>>> --
>>>> Thanks
>>>> Zhou Wenjian
>>>>
>>>> On 07/21/2015 02:29 PM, Zhou Wenjian wrote:
>>>>> This patch set implements parallel processing by means of multiple threads.
>>>>> With this patch set, it is available to use multiple threads to read
>>>>> and compress pages. This parallel process will save time.
>>>>> This feature only supports creating dumpfile in kdump-compressed format from
>>>>> vmcore in kdump-compressed format or elf format. Currently, sadump and
>>>>> xen kdump are not supported.
>>>>>
>>>>> Qiao Nuohan (10):
>>>>> Add readpage_kdump_compressed_parallel
>>>>> Add mappage_elf_parallel
>>>>> Add readpage_elf_parallel
>>>>> Add read_pfn_parallel
>>>>> Add function to initial bitmap for parallel use
>>>>> Add filter_data_buffer_parallel
>>>>> Add write_kdump_pages_parallel to allow parallel process
>>>>> Initial and free data used for parallel process
>>>>> Make makedumpfile available to read and compress pages parallelly
>>>>> Add usage and manual about multiple threads process
>>>>>
>>>>> Makefile | 2 +
>>>>> erase_info.c | 29 ++-
>>>>> erase_info.h | 2 +
>>>>> makedumpfile.8 | 24 ++
>>>>> makedumpfile.c | 1095 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>>> makedumpfile.h | 80 ++++
>>>>> print_info.c | 16 +
>>>>> 7 files changed, 1245 insertions(+), 3 deletions(-)
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> kexec mailing list
>>>>> kexec@lists.infradead.org
>>>>> http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v3 00/10] makedumpfile: parallel processing
2015-07-31 9:35 ` "Zhou, Wenjian/周文剑"
@ 2015-08-05 2:46 ` "Zhou, Wenjian/周文剑"
2015-08-06 2:46 ` Atsushi Kumagai
0 siblings, 1 reply; 18+ messages in thread
From: "Zhou, Wenjian/周文剑" @ 2015-08-05 2:46 UTC (permalink / raw)
To: Atsushi Kumagai; +Cc: kexec@lists.infradead.org
ping...
--
Thanks
Zhou Wenjian
On 07/31/2015 05:35 PM, "Zhou, Wenjian/周文剑" wrote:
> On 07/31/2015 04:27 PM, Atsushi Kumagai wrote:
>>> On 07/23/2015 02:20 PM, Atsushi Kumagai wrote:
>>>>> Hello Kumagai,
>>>>>
>>>>> The PATCH v3 has improved the performance.
>>>>> The performance degradation in PATCH v2 mainly caused by the page_fault
>>>>> produced by the function compress2().
>>>>>
>>>>> I wrote some codes to test the performance of compress2. It almost costs
>>>>> the same time and produces the same amount of page_fault as executing compress2
>>>>> in thread.
>>>>>
>>>>> To reduce page_faults, I have to do the following in kdump_thread_function_cyclic().
>>>>>
>>>>> + /*
>>>>> + * lock memory to reduce page_faults by compress2()
>>>>> + */
>>>>> + void *temp = malloc(1);
>>>>> + memset(temp, 0, 1);
>>>>> + mlockall(MCL_CURRENT);
>>>>> + free(temp);
>>>>> +
>>>>>
>>>>> With this, using a thread or not almost has the same performance.
>>>>
>>>> Hmm... I can't get good results with this patch, many page faults still
>>>> occur. I guess mlock will change when page faults occur, but will not
>>>> change the total number of page faults.
>>>> Could you explain why compress2() causes many page faults only in thread,
>>>> then I may understand why this patch is meaningful.
>>>>
>>>
>>> Actually, it will also cause so much page faults even not in thread, if
>>> info->bitmap2 is not freed in makedumpfile.
>>>
>>> I wrote some codes to test the performance of compress2().
>>>
>>> <cut>
>>> buf = malloc(PAGE_SIZE);
>>> bufout = malloc(SIZE_OUT);
>>> memset(buf, 1, PAGE_SIZE / 2);
>>> while (1)
>>> compress2(bufout, &size_out, buf, PAGE_SIZE, Z_BEST_SPEED);
>>> <cut>
>>>
>>> The codes almost like this.
>>> It will cause much page faults.
>>>
>>> But if the codes turn to be the following, it will be much better.
>>>
>>> <cut>
>>> temp = malloc(TEMP_SIZE);
>>> memset(temp, 0, TEMP_SIZE);
>>> free(temp);
>>>
>>> buf = malloc(PAGE_SIZE);
>>> bufout = malloc(SIZE_OUT);
>>> memset(buf, 1, PAGE_SIZE / 2);
>>> while (1)
>>> compress2(bufout, &size_out, buf, PAGE_SIZE, Z_BEST_SPEED);
>>> <cut>
>>>
>>> TEMP_SIZE must be large enough.
>>> (larger than 135097 will work,in my machine)
>>>
>>>
>>> If in thread, the following codes can reduce the page faults.
>>>
>>> <cut>
>>> temp = malloc(1);
>>> memset(temp, 0, 1);
>>> mlockall(MCL_CURRENT);
>>> free(temp);
>>>
>>> buf = malloc(PAGE_SIZE);
>>> bufout = malloc(SIZE_OUT);
>>> memset(buf, 1, PAGE_SIZE / 2);
>>> while (1)
>>> compress2(bufout, &size_out, buf, PAGE_SIZE, Z_BEST_SPEED);
>>> <cut>
>>>
>>> I haven't known why.
>>
>> I assume that we are facing the known issue of glibc:
>>
>> https://sourceware.org/ml/libc-alpha/2015-03/msg00270.html
>>
>> According to the thread above, per-thread arena is easy to be grown and
>> trimmed compared with main arena.
>> Actually compress2() calls malloc() and free() for compression each time
>> it is called, so every compression processing will cause page fault.
>> Moreover, I confirmed that many madvise(MADV_DONTNEED) are invoked only
>> when compress2() is called in thread.
>>
>> OTOH, in lzo case, a temp buffer for working is allocated on the caller
>> side, so it can reduce the number of malloc()/free() pair.
>> (but I'm not sure why snappy doesn't hit this issue. The buffer size
>> for compression may be smaller than the trim threshold.)
>>
>> Anyway, basically it's hard for zlib to avoid this issue on the application
>> side, it seems that we have to accept the performance degradation caused by it.
>> Unfortunately, the main target of this multi thread feature is zlib as you
>> measured, we should resolve this issue somehow.
>>
>> Nevertheless, even now we can get some benefit of parallel processing,
>> so lets' start to discuss the implementation of the parallel processing
>> feature to accept this patch. I have some comments:
>>
>> - read_pfn_parallel() doesn't use the cache feature(cache.c), is it
>> intentional with you ?
>>
>
> Yes, since the data are read once a page here, cache feature seems not
> needed.
>
>> - Now --num-buffers is tunable but the man description and your benchmark
>> didn't mention what is the benefit of this parameter.
>>
>
> The default value of num-buffers is 50. Originally the value has great influence
> on the performance. But since we changed the logic in the 2nd version of the
> patch set, more buffers have little improvement(1000 buffers may have 1% improvement).
> I'm considering if the option should be removed. what do you think about it?
>
> BTW, the code (mlockall) added in the 3rd version works well in several machines here.
> Should I keep it ?
> With the codes, madvise(MADV_DONTNEED) will be failed in compress2 and the performance
> is as expected in these machines.
>
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: [PATCH v3 00/10] makedumpfile: parallel processing
2015-08-05 2:46 ` "Zhou, Wenjian/周文剑"
@ 2015-08-06 2:46 ` Atsushi Kumagai
0 siblings, 0 replies; 18+ messages in thread
From: Atsushi Kumagai @ 2015-08-06 2:46 UTC (permalink / raw)
To: zhouwj-fnst@cn.fujitsu.com; +Cc: kexec@lists.infradead.org
>On 07/31/2015 05:35 PM, "Zhou, Wenjian/周文剑" wrote:
>> On 07/31/2015 04:27 PM, Atsushi Kumagai wrote:
>>>> On 07/23/2015 02:20 PM, Atsushi Kumagai wrote:
>>>>>> Hello Kumagai,
>>>
>>> I assume that we are facing the known issue of glibc:
>>>
>>> https://sourceware.org/ml/libc-alpha/2015-03/msg00270.html
>>>
>>> According to the thread above, per-thread arena is easy to be grown and
>>> trimmed compared with main arena.
>>> Actually compress2() calls malloc() and free() for compression each time
>>> it is called, so every compression processing will cause page fault.
>>> Moreover, I confirmed that many madvise(MADV_DONTNEED) are invoked only
>>> when compress2() is called in thread.
>>>
>>> OTOH, in lzo case, a temp buffer for working is allocated on the caller
>>> side, so it can reduce the number of malloc()/free() pair.
>>> (but I'm not sure why snappy doesn't hit this issue. The buffer size
>>> for compression may be smaller than the trim threshold.)
>>>
>>> Anyway, basically it's hard for zlib to avoid this issue on the application
>>> side, it seems that we have to accept the performance degradation caused by it.
>>> Unfortunately, the main target of this multi thread feature is zlib as you
>>> measured, we should resolve this issue somehow.
>>>
>>> Nevertheless, even now we can get some benefit of parallel processing,
>>> so lets' start to discuss the implementation of the parallel processing
>>> feature to accept this patch. I have some comments:
>>>
>>> - read_pfn_parallel() doesn't use the cache feature(cache.c), is it
>>> intentional with you ?
>>>
>>
>> Yes, since the data are read once a page here, cache feature seems not
>> needed.
OK, I see.
>>
>>> - Now --num-buffers is tunable but the man description and your benchmark
>>> didn't mention what is the benefit of this parameter.
>>>
>>
>> The default value of num-buffers is 50. Originally the value has great influence
>> on the performance. But since we changed the logic in the 2nd version of the
>> patch set, more buffers have little improvement(1000 buffers may have 1% improvement).
>> I'm considering if the option should be removed. what do you think about it?
I think this option should be removed; most users wouldn't use it.
>> BTW, the code (mlockall) added in the 3rd version works well in several machines here.
>> Should I keep it ?
>> With the codes, madvise(MADV_DONTNEED) will be failed in compress2 and the performance
>> is as expected in these machines.
That kludge isn't reasonable; it just changes the memory allocation pattern.
If you can't explain in theory why it works, you should get rid of it.
Thanks
Atsushi Kumagai
>
>_______________________________________________
>kexec mailing list
>kexec@lists.infradead.org
>http://lists.infradead.org/mailman/listinfo/kexec
^ permalink raw reply [flat|nested] 18+ messages in thread