All of lore.kernel.org
 help / color / mirror / Atom feed
From: "\"Zhou, Wenjian/周文剑\"" <zhouwj-fnst@cn.fujitsu.com>
To: Atsushi Kumagai <ats-kumagai@wm.jp.nec.com>
Cc: "kexec@lists.infradead.org" <kexec@lists.infradead.org>
Subject: Re: [PATCH RFC 00/11] makedumpfile: parallel processing
Date: Mon, 14 Dec 2015 16:59:10 +0800	[thread overview]
Message-ID: <566E84DE.2020804@cn.fujitsu.com> (raw)
In-Reply-To: <0910DD04CBD6DE4193FCF86B9C00BE9701E0D0DA@BPXM01GP.gisp.nec.co.jp>

On 12/14/2015 04:26 PM, Atsushi Kumagai wrote:
>>>> Think about this, in a huge memory, most of the page will be filtered, and
>>>> we have 5 buffers.
>>>>
>>>> page1       page2      page3     page4     page5      page6       page7 .....
>>>> [buffer1]   [2]        [3]       [4]       [5]
>>>> unfiltered  filtered   filtered  filtered  filtered   unfiltered  filtered
>>>>
>>>> Since filtered page will take a buffer, when compressing page1,
>>>> page6 can't be compressed at the same time.
>>>> That why it will prevent parallel compression.
>>>
>>> Thanks for your explanation, I understand.
>>> This is just an issue of the current implementation, there is no
>>> reason to stand this restriction.
>>>
>>>>> Further, according to Chao's benchmark, there is a big performance
>>>>> degradation even if the number of thread is 1. (58s vs 240s)
>>>>> The current implementation seems to have some problems, we should
>>>>> solve them.
>>>>>
>>>>
>>>> If "-d 31" is specified, on the one hand we can't save time by compressing
>>>> parallel, on the other hand we will introduce some extra work by adding
>>>> "--num-threads". So it is obvious that it will have a performance degradation.
>>>
>>> Sure, there must be some overhead due to "some extra work"(e.g. exclusive lock),
>>> but "--num-threads=1 is 4 times slower than --num-threads=0" still sounds
>>> too slow, the degradation is too big to be called "some extra work".
>>>
>>> Both --num-threads=0 and --num-threads=1 are serial processing,
>>> the above "buffer fairness issue" will not be related to this degradation.
>>> What do you think what make this degradation ?
>>>
>>
>> I can't get such result at this moment, so I can't do some further investigation
>> right now. I guess it may be caused by the underlying implementation of pthread.
>> I reviewed the test result of the patch v2 and found in different machines,
>> the results are quite different.
>
> Unluckily, I also can't reproduce such big degradation.
> According to the Chao's verification, this issue seems different form
> the "too many page fault issue" that we solved.
> I have no ideas, but at least I want to confirm whether this issue
> is avoidable or not.
>
>> It seems that I can get almost the same result of Chao from "PRIMEQUEST 1800E".
>>
>> ###################################
>> - System: PRIMERGY RX300 S6
>> - CPU: Intel(R) Xeon(R) CPU x5660
>> - memory: 16GB
>> ###################################
>> ************ makedumpfile -d 7 ******************
>>                  core-data       0       256
>>          threads-num
>> -l
>>          0                       10      144
>>          4                       5       110
>>          8                       5       111
>>          12                      6       111
>>
>> ************ makedumpfile -d 31 ******************
>>                  core-data       0       256
>>          threads-num
>> -l
>>          0                       0       0
>>          4                       2       2
>>          8                       2       3
>>          12                      2       3
>>
>> ###################################
>> - System: PRIMEQUEST 1800E
>> - CPU: Intel(R) Xeon(R) CPU E7540
>> - memory: 32GB
>> ###################################
>> ************ makedumpfile -d 7 ******************
>>                  core-data        0       256
>>          threads-num
>> -l
>>          0                        34      270
>>          4                        63      154
>>          8                        64      131
>>          12                       65      159
>>
>> ************ makedumpfile -d 31 ******************
>>                  core-data        0       256
>>          threads-num
>> -l
>>          0                        2       1
>>          4                        48      48
>>          8                        48      49
>>          12                       49      50
>>
>>>> I'm not so sure if it is a problem that the performance degradation is so big.
>>>> But I think if in other cases, it works as expected, this won't be a problem(
>>>> or a problem needs to be fixed), for the performance degradation existing
>>>> in theory.
>>>>
>>>> Or the current implementation should be replaced by a new arithmetic.
>>>> For example:
>>>> We can add an array to record whether the page is filtered or not.
>>>> And only the unfiltered page will take the buffer.
>>>
>>> We should discuss how to implement new mechanism, I'll mention this later.
>>>
>>>> But I'm not sure if it is worth.
>>>> For "-l -d 31" is fast enough, the new arithmetic also can't do much help.
>>>
>>> Basically the faster, the better. There is no obvious target time.
>>> If there is room for improvement, we should do it.
>>>
>>
>> Maybe we can improve the performance of "-c -d 31" in some case.
>
> Yes, the buffer is used for -c, -l and -p, not only for -l.
> It would be useful to improve that.
>
>> BTW, we can easily get the theoretical performance by using the "--split".
>
> Are you sure ? You persuaded me in the thread below:
>
>    http://lists.infradead.org/pipermail/kexec/2015-June/013881.html
>
> --num-threads is orthogonal to --split, it's better to use the both
> option since they try to solve different bottlenecks.
> That's why I decided to merge your multi thread feature.
>
> However, what you said sounds --split is a superset of --num-threads.
> You don't need the multi thread feature ?
>

I just mean the performance.
There is no doubt that we will use multi-threads in --split in the future.

But as we all known, threads and processes have some common characters.
And in makedumpfile, if we use "--split core1 core2 core3 core4" and
"--num-threads 4" separately, the spent time should not be quite different.

Since the logic of "--split" is more simple, if we can't improve the performance
of "-l -d 31" by "--split", we also don't have much chance to do it by "--num-threads".

I just mean that.
It is of course that --split is not a super set of --num-threads.

-- 
Thanks
Zhou



_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

  reply	other threads:[~2015-12-14  9:01 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-05  7:56 [PATCH RFC 00/11] makedumpfile: parallel processing Zhou Wenjian
2015-06-05  7:56 ` [PATCH RFC 01/11] Add readpage_kdump_compressed_parallel Zhou Wenjian
2015-06-05  7:56 ` [PATCH RFC 02/11] Add mappage_elf_parallel Zhou Wenjian
2015-06-05  7:56 ` [PATCH RFC 03/11] Add readpage_elf_parallel Zhou Wenjian
2015-06-05  7:56 ` [PATCH RFC 04/11] Add read_pfn_parallel Zhou Wenjian
2015-06-05  7:56 ` [PATCH RFC 05/11] Add function to initial bitmap for parallel use Zhou Wenjian
2015-06-05  7:57 ` [PATCH RFC 06/11] Add filter_data_buffer_parallel Zhou Wenjian
2015-06-05  7:57 ` [PATCH RFC 07/11] Add write_kdump_pages_parallel to allow parallel process Zhou Wenjian
2015-06-05  7:57 ` [PATCH RFC 08/11] Add write_kdump_pages_parallel_cyclic to allow parallel process in cyclic_mode Zhou Wenjian
2015-06-05  7:57 ` [PATCH RFC 09/11] Initial and free data used for parallel process Zhou Wenjian
2015-06-05  7:57 ` [PATCH RFC 10/11] Make makedumpfile available to read and compress pages parallelly Zhou Wenjian
2015-06-05  7:57 ` [PATCH RFC 11/11] Add usage and manual about multiple threads process Zhou Wenjian
2015-06-08  3:55 ` [PATCH RFC 00/11] makedumpfile: parallel processing "Zhou, Wenjian/周文剑"
2015-12-01  8:39   ` Chao Fan
2015-12-02  5:29     ` "Zhou, Wenjian/周文剑"
2015-12-02  7:24       ` Dave Young
2015-12-02  7:38         ` "Zhou, Wenjian/周文剑"
2015-12-04  2:30           ` Atsushi Kumagai
2015-12-04  3:33             ` "Zhou, Wenjian/周文剑"
2015-12-04  8:56               ` Chao Fan
2015-12-07  1:09                 ` "Zhou, Wenjian/周文剑"
2015-12-10  8:14               ` Atsushi Kumagai
2015-12-10  9:36                 ` "Zhou, Wenjian/周文剑"
2015-12-10  9:58                   ` Chao Fan
2015-12-10 10:32                     ` "Zhou, Wenjian/周文剑"
2015-12-10 10:54                       ` Chao Fan
2015-12-22  8:32                         ` HATAYAMA Daisuke
2015-12-24  2:20                           ` Chao Fan
2015-12-24  3:22                             ` HATAYAMA Daisuke
2015-12-24  3:31                               ` Chao Fan
2015-12-24  3:50                                 ` HATAYAMA Daisuke
2015-12-24  6:02                                   ` Chao Fan
2015-12-24  7:22                                     ` HATAYAMA Daisuke
2015-12-24  8:20                                     ` Atsushi Kumagai
2015-12-24  9:04                                       ` Chao Fan
2015-12-14  8:26                   ` Atsushi Kumagai
2015-12-14  8:59                     ` "Zhou, Wenjian/周文剑" [this message]
2015-06-10  6:06 ` Atsushi Kumagai
2015-06-11  3:47   ` "Zhou, Wenjian/周文剑"
2015-06-15  1:59     ` qiaonuohan
2015-06-15  5:57       ` Atsushi Kumagai
2015-06-15  6:06         ` qiaonuohan
2015-06-15  6:07         ` qiaonuohan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=566E84DE.2020804@cn.fujitsu.com \
    --to=zhouwj-fnst@cn.fujitsu.com \
    --cc=ats-kumagai@wm.jp.nec.com \
    --cc=kexec@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.