Kexec Archive on lore.kernel.org
From: "Zhou, Wenjian/周文剑" <zhouwj-fnst@cn.fujitsu.com>
To: Atsushi Kumagai <ats-kumagai@wm.jp.nec.com>
Cc: "kexec@lists.infradead.org" <kexec@lists.infradead.org>
Subject: Re: [PATCH v3 00/10] makedumpfile: parallel processing
Date: Thu, 23 Jul 2015 14:39:25 +0800
Message-ID: <55B08C1D.7010505@cn.fujitsu.com>
In-Reply-To: <0910DD04CBD6DE4193FCF86B9C00BE9701DD0947@BPXM01GP.gisp.nec.co.jp>

On 07/23/2015 02:20 PM, Atsushi Kumagai wrote:
>> Hello Kumagai,
>>
>> The PATCH v3 has improved the performance.
>> The performance degradation in PATCH v2 mainly caused by the page_fault
>> produced by the function compress2().
>>
>> I wrote some codes to test the performance of compress2. It almost costs
>> the same time and produces the same amount of page_fault as executing compress2
>> in thread.
>>
>> To reduce page_faults, I have to do the following in kdump_thread_function_cyclic().
>>
>> +	/*
>> +	 * lock memory to reduce page_faults by compress2()
>> +	 */
>> +	void *temp = malloc(1);
>> +	memset(temp, 0, 1);
>> +	mlockall(MCL_CURRENT);
>> +	free(temp);
>> +
>>
>> With this, using a thread or not almost has the same performance.
>
> Hmm... I can't get good results with this patch, many page faults still
> occur. I guess mlock will change when page faults occur, but will not
> change the total number of page faults.
> Could you explain why compress2() causes many page faults only in thread,
> then I may understand why this patch is meaningful.
>

Actually, it causes just as many page faults outside a thread, if
info->bitmap2 is not freed in makedumpfile.

I wrote some code to test the performance of compress2().

<cut>
buf = malloc(PAGE_SIZE);
bufout = malloc(SIZE_OUT);
memset(buf, 1, PAGE_SIZE / 2);
while (1)
     compress2(bufout, &size_out, buf, PAGE_SIZE, Z_BEST_SPEED);
<cut>

The code is roughly like this.
It causes many page faults.

But if the code is changed to the following, the result is much better.

<cut>
temp = malloc(TEMP_SIZE);
memset(temp, 0, TEMP_SIZE);
free(temp);

buf = malloc(PAGE_SIZE);
bufout = malloc(SIZE_OUT);
memset(buf, 1, PAGE_SIZE / 2);
while (1)
     compress2(bufout, &size_out, buf, PAGE_SIZE, Z_BEST_SPEED);
<cut>

TEMP_SIZE must be large enough.
(Any value larger than 135097 works on my machine.)
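
One way to compare the two variants above is to read the process's
minor-fault counter before and after the workload. The following is only
a minimal sketch, assuming a POSIX system with getrusage().
count_minor_faults() and touch_buffer() are hypothetical helper names,
not part of makedumpfile, and touch_buffer() is a stand-in workload; the
compress2() loop (with a fixed iteration count) would be passed in its
place. The 4096-byte stride is an assumed page size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: return the number of minor page faults the
 * process took while work(arg) ran, or -1 if getrusage() failed.
 * RUSAGE_SELF covers all threads of the calling process.
 */
long count_minor_faults(void (*work)(void *), void *arg)
{
	struct rusage before, after;

	assert(work != NULL);
	if (getrusage(RUSAGE_SELF, &before) != 0)
		return -1;
	work(arg);
	if (getrusage(RUSAGE_SELF, &after) != 0)
		return -1;
	return after.ru_minflt - before.ru_minflt;
}

/*
 * Stand-in workload: allocate a buffer of *arg bytes, write one byte
 * per assumed 4096-byte page, and free it. The volatile pointer keeps
 * the compiler from dropping the writes.
 */
void touch_buffer(void *arg)
{
	size_t len = *(size_t *)arg;
	volatile char *p = malloc(len);
	size_t i;

	if (p == NULL)
		return;
	for (i = 0; i < len; i += 4096)
		p[i] = 1;
	free((void *)p);
}
```

Calling count_minor_faults() once around each variant, with the same
number of compress2() iterations, gives two numbers that can be compared
directly instead of relying on elapsed time alone.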


In a thread, the following code reduces the page faults.

<cut>
temp = malloc(1);
memset(temp, 0, 1);
mlockall(MCL_CURRENT);
free(temp);

buf = malloc(PAGE_SIZE);
bufout = malloc(SIZE_OUT);
memset(buf, 1, PAGE_SIZE / 2);
while (1)
     compress2(bufout, &size_out, buf, PAGE_SIZE, Z_BEST_SPEED);
<cut>

I don't know why yet.
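
The mlockall() snippet above ignores the return value, so a failed call
would look the same as a call that has no effect. mlockall() can fail,
for example with ENOMEM or EPERM when RLIMIT_MEMLOCK does not allow the
lock. A minimal sketch of a checked variant follows;
lock_current_memory() is a hypothetical wrapper name, not a function in
the patch set.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical wrapper: lock all pages currently mapped into the
 * process and report a failure instead of ignoring it.
 * Returns 0 on success, or a negative errno value on failure.
 */
int lock_current_memory(void)
{
	if (mlockall(MCL_CURRENT) != 0) {
		int err = errno;

		fprintf(stderr, "mlockall(MCL_CURRENT) failed: %s\n",
			strerror(err));
		return -err;
	}
	return 0;
}
```

If this returns nonzero on a test machine, the page-fault numbers from
that run say nothing about whether locking helps.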

-- 
Thanks
Zhou Wenjian

>
> Thanks
> Atsushi Kumagai
>
>> In our machine, I can get the same result as the following with PATCH v2.
>>> Test2-1:
>>>    | threads | compress time | exec time |
>>>    |    1    |     76.12     |   82.13   |
>>>
>>> Test2-2:
>>>    | threads | compress time | exec time |
>>>    |    1    |     41.97     |   51.46   |
>>
>> I test the new patch set in the machine, and below is the results.
>>
>> PATCH V2:
>> ###################################
>> - System: PRIMEQUEST 1800E
>> - CPU: Intel(R) Xeon(R) CPU E7540
>> - memory: 32GB
>> ###################################
>> ************ makedumpfile -d 0 ******************
>>                  core-data               0       256     512     768     1024    1280    1536    1792
>>          threads-num
>> -c
>>          0                               158     1505    2119    2129    1707    1483    1440    1273
>>          4                               207     589     672     673     636     564     536     514
>>          8                               176     327     377     387     367     336     314     291
>>          12                              191     272     295     306     288     259     257     240
>>
>> ************ makedumpfile -d 7 ******************
>>                  core-data               0       256     512     768     1024    1280    1536    1792
>>          threads-num
>> -c
>>          0                               154     1508    2089    2133    1792    1660    1462    1312
>>          4                               203     594     684     701     627     592     535     503
>>          8                               172     326     377     393     366     334     313     286
>>          12                              182     273     295     308     283     258     249     237
>>
>>
>>
>> PATCH v3:
>> ###################################
>> - System: PRIMEQUEST 1800E
>> - CPU: Intel(R) Xeon(R) CPU E7540
>> - memory: 32GB
>> ###################################
>> ************ makedumpfile -d 0 ******************
>>                  core-data               0       256     512     768     1024    1280    1536    1792
>>          threads-num
>> -c
>>          0                               192     1488    1830
>>          4                               62      393     477
>>          8                               78      211     258
>>
>> ************ makedumpfile -d 7 ******************
>>                  core-data               0       256     512     768     1024    1280    1536    1792
>>          threads-num
>> -c
>>          0                               197     1475    1815
>>          4                               62      396     482
>>          8                               78      209     252
>>
>>
>> --
>> Thanks
>> Zhou Wenjian
>>
>> On 07/21/2015 02:29 PM, Zhou Wenjian wrote:
>>> This patch set implements parallel processing by means of multiple threads.
>>> With this patch set, it is available to use multiple threads to read
>>> and compress pages. This parallel process will save time.
>>> This feature only supports creating dumpfile in kdump-compressed format from
>>> vmcore in kdump-compressed format or elf format. Currently, sadump and
>>>    xen kdump are not supported.
>>>
>>> Qiao Nuohan (10):
>>>     Add readpage_kdump_compressed_parallel
>>>     Add mappage_elf_parallel
>>>     Add readpage_elf_parallel
>>>     Add read_pfn_parallel
>>>     Add function to initial bitmap for parallel use
>>>     Add filter_data_buffer_parallel
>>>     Add write_kdump_pages_parallel to allow parallel process
>>>     Initial and free data used for parallel process
>>>     Make makedumpfile available to read and compress pages parallelly
>>>     Add usage and manual about multiple threads process
>>>
>>>    Makefile       |    2 +
>>>    erase_info.c   |   29 ++-
>>>    erase_info.h   |    2 +
>>>    makedumpfile.8 |   24 ++
>>>    makedumpfile.c | 1095 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>    makedumpfile.h |   80 ++++
>>>    print_info.c   |   16 +
>>>    7 files changed, 1245 insertions(+), 3 deletions(-)
>>>
>>>
>>> _______________________________________________
>>> kexec mailing list
>>> kexec@lists.infradead.org
>>> http://lists.infradead.org/mailman/listinfo/kexec
>>>



