Kexec Archive on lore.kernel.org
From: "Zhou, Wenjian/周文剑" <zhouwj-fnst@cn.fujitsu.com>
To: Atsushi Kumagai <ats-kumagai@wm.jp.nec.com>
Cc: "kexec@lists.infradead.org" <kexec@lists.infradead.org>
Subject: Re: [PATCH RFC 00/11] makedumpfile: parallel processing
Date: Thu, 10 Dec 2015 17:36:47 +0800
Message-ID: <566947AF.5010800@cn.fujitsu.com>
In-Reply-To: <0910DD04CBD6DE4193FCF86B9C00BE9701E0C319@BPXM01GP.gisp.nec.co.jp>

On 12/10/2015 04:14 PM, Atsushi Kumagai wrote:
>> Hello Kumagai,
>>
>> On 12/04/2015 10:30 AM, Atsushi Kumagai wrote:
>>> Hello, Zhou
>>>
>>>> On 12/02/2015 03:24 PM, Dave Young wrote:
>>>>> Hi,
>>>>>
>>>>> On 12/02/15 at 01:29pm, "Zhou, Wenjian/周文剑" wrote:
>>>>>> I think there is no problem if the other test results are as expected.
>>>>>>
>>>>>> --num-threads mainly reduces the time spent on compression.
>>>>>> So for lzo, it can't help much most of the time.
>>>>>
>>>>> It seems the help text for --num-threads does not say that exactly:
>>>>>
>>>>>      [--num-threads THREADNUM]:
>>>>>          Using multiple threads to read and compress data of each page in parallel.
>>>>>          And it will reduce time for saving DUMPFILE.
>>>>>          This feature only supports creating DUMPFILE in kdump-compressed format from
>>>>>          VMCORE in kdump-compressed format or elf format.
>>>>>
>>>>> Lzo is also a compression method, so it should be mentioned that --num-threads
>>>>> only supports zlib-compressed vmcore.
>>>>>
>>>>
>>>> Sorry, it seems that what I said was not so clear.
>>>> lzo is also supported. Since lzo compresses data at high speed, the
>>>> performance improvement is not so obvious most of the time.
>>>>
>>>>> It is also worth mentioning the recommended -d value for this feature.
>>>>>
>>>>
>>>> Yes, I think it's worth mentioning. I forgot it.
>>>
>>> I saw your patch, but I think I should confirm what the problem is first.
>>>
>>>> However, when "-d 31" is specified, it will be worse.
>>>> Fewer than 50 buffers are used to cache the compressed pages.
>>>> And even if a page has been filtered, it will still take a buffer.
>>>> So if "-d 31" is specified, the filtered pages will use a lot
>>>> of buffers. Then the pages which need to be compressed can't
>>>> be compressed in parallel.
>>>
>>> Could you explain in more detail why compression will not be parallel?
>>> Using buffers for filtered pages as well does sound inefficient.
>>> However, I don't understand why it prevents parallel compression.
>>>
>>
>> Think about this: with huge memory, most of the pages will be filtered, and
>> we have 5 buffers.
>>
>> page1       page2      page3     page4     page5      page6       page7 .....
>> [buffer1]   [2]        [3]       [4]       [5]
>> unfiltered  filtered   filtered  filtered  filtered   unfiltered  filtered
>>
>> Since a filtered page also takes a buffer, page6 can't be compressed
>> at the same time as page1.
>> That's why it prevents parallel compression.
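
To make the example above concrete, here is a minimal Python sketch of that buffer model. It is only an illustration, not code from makedumpfile: the function name max_concurrent_compressions and the list-of-booleans page representation are assumptions, and it assumes buffers are handed to consecutive pages in order, one per page whether the page is filtered or not.

```python
def max_concurrent_compressions(filtered, num_buffers):
    """Return the largest number of unfiltered pages that hold buffers at once.

    filtered[i] is True when page i is excluded by the dump level.
    Assumption: every page occupies one buffer, and the buffers cover a
    sliding window of num_buffers consecutive pages.
    """
    best = 0
    in_window = 0
    for i, is_filtered in enumerate(filtered):
        if not is_filtered:
            in_window += 1
        # The page that fell out of the window releases its buffer.
        if i >= num_buffers and not filtered[i - num_buffers]:
            in_window -= 1
        best = max(best, in_window)
    return best


# page1 .. page7 from the diagram: only page1 and page6 are unfiltered.
pages = [False, True, True, True, True, False, True]
print(max_concurrent_compressions(pages, 5))  # prints 1
```

With 5 buffers, page1 and page6 never hold a buffer at the same time, so in this model at most one page is being compressed no matter how many threads are available.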
>
> Thanks for your explanation, I understand.
> This is just an issue of the current implementation; there is no
> reason to put up with this restriction.
>
>>> Further, according to Chao's benchmark, there is a big performance
>>> degradation even if the number of threads is 1. (58s vs 240s)
>>> The current implementation seems to have some problems; we should
>>> solve them.
>>>
>>
>> If "-d 31" is specified, on the one hand we can't save time by compressing
>> in parallel, and on the other hand we introduce some extra work by adding
>> "--num-threads". So it is obvious that there will be a performance degradation.
>
> Sure, there must be some overhead due to "some extra work" (e.g. exclusive lock),
> but "--num-threads=1 is 4 times slower than --num-threads=0" still sounds
> too slow; the degradation is too big to be explained by "some extra work".
>
> Both --num-threads=0 and --num-threads=1 are serial processing, so
> the above "buffer fairness issue" will not be related to this degradation.
> What do you think causes this degradation?
>

I can't reproduce that result at the moment, so I can't investigate further
right now. My guess is that it is caused by the underlying pthread implementation.
I reviewed the test results for patch v2 and found that they differ
considerably between machines.

On the "PRIMEQUEST 1800E", I seem to get almost the same result as Chao.

###################################
- System: PRIMERGY RX300 S6
- CPU: Intel(R) Xeon(R) CPU X5660
- memory: 16GB
###################################
************ makedumpfile -d 7 ******************
                 core-data       0       256
         threads-num
-l
         0                       10      144
         4                       5       110
         8                       5       111
         12                      6       111

************ makedumpfile -d 31 ******************
                 core-data       0       256
         threads-num
-l
         0                       0       0
         4                       2       2
         8                       2       3
         12                      2       3

###################################
- System: PRIMEQUEST 1800E
- CPU: Intel(R) Xeon(R) CPU E7540
- memory: 32GB
###################################
************ makedumpfile -d 7 ******************
                 core-data        0       256
         threads-num
-l
         0                        34      270
         4                        63      154
         8                        64      131
         12                       65      159

************ makedumpfile -d 31 ******************
                 core-data        0       256
         threads-num
-l
         0                        2       1
         4                        48      48
         8                        48      49
         12                       49      50

>> I'm not so sure whether such a big performance degradation is a problem.
>> But I think that if it works as expected in the other cases, this won't be
>> a problem (or at least not one that needs to be fixed), since the performance
>> degradation is expected in theory.
>>
>> Or the current implementation should be replaced by a new algorithm.
>> For example:
>> We can add an array to record whether each page is filtered or not.
>> Then only the unfiltered pages will take a buffer.
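
A minimal Python sketch of this idea follows. It is only an illustration, not makedumpfile code: the function name assign_buffers and the boolean list standing in for the per-page "filtered" array are hypothetical.

```python
def assign_buffers(filtered, num_buffers):
    """Return the indices of the first pages that receive a buffer.

    filtered[i] is True when page i is excluded by the dump level.
    Filtered pages are skipped, so every buffer handed out goes to a
    page that really needs compression.
    """
    batch = []
    for page, is_filtered in enumerate(filtered):
        if is_filtered:
            continue  # a filtered page no longer consumes a buffer
        batch.append(page)
        if len(batch) == num_buffers:
            break
    return batch


# Seven pages, of which only index 0 and index 5 are unfiltered.
pages = [False, True, True, True, True, False, True]
print(assign_buffers(pages, 5))  # prints [0, 5]
```

In this sketch both unfiltered pages get a buffer in the same batch, so they could be compressed at the same time even though four filtered pages sit between them.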
>
> We should discuss how to implement a new mechanism; I'll mention this later.
>
>> But I'm not sure if it is worth it.
>> Since "-l -d 31" is fast enough, the new algorithm can't help much either.
>
> Basically the faster, the better. There is no obvious target time.
> If there is room for improvement, we should do it.
>

Maybe we can improve the performance of "-c -d 31" in some cases.

BTW, we can easily get the theoretical performance by using "--split".

-- 
Thanks
Zhou



