From: Eric DeVolder <eric.devolder@oracle.com>
To: Atsushi Kumagai <ats-kumagai@wm.jp.nec.com>,
"kexec@lists.infradead.org" <kexec@lists.infradead.org>
Cc: "daniel.kiper@oracle.com" <daniel.kiper@oracle.com>,
"konrad.wilk@oracle.com" <konrad.wilk@oracle.com>
Subject: Re: [makedumpfile PATCH] Allow PFN_EXCLUDED to be tunable via command line option --exclude-threshold
Date: Tue, 11 Jul 2017 14:42:12 -0500
Message-ID: <a32867ed-e9de-bb0e-907a-9a94630760aa@oracle.com>
In-Reply-To: <0910DD04CBD6DE4193FCF86B9C00BE9701EEFA89@BPXM01GP.gisp.nec.co.jp>
Atsushi,
Please see response below!
eric
On 07/11/2017 02:43 AM, Atsushi Kumagai wrote:
> Hello Eric,
>
>>> On 07/07/2017 04:09 AM, Atsushi Kumagai wrote:
>>>>> The PFN_EXCLUDED value is used to control at which point a run of
>>>>> zeros in the bitmap (zeros denote excluded pages) is large enough
>>>>> to warrant truncating the current output segment and to create a
>>>>> new output segment (containing non-excluded pages), in an ELF dump.
>>>>>
>>>>> If the run is smaller than PFN_EXCLUDED, then those excluded pages
>>>>> are still output in the ELF dump, for the current output segment.
>>>>>
>>>>> By using smaller values of PFN_EXCLUDED, the resulting dump file
>>>>> size can be made smaller by actually removing more excluded pages
>>>>> from the resulting dump file.
>>>>>
>>>>> This patch adds the command line option --exclude-threshold=<value>
>>>>> to indicate the threshold. The default is 256, the legacy value
>>>>> of PFN_EXCLUDED. The smallest value permitted is 1.
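The splitting rule described above can be sketched as follows. This is a simplified model, not the makedumpfile source: the function names split_segments and pages_written are hypothetical, the bitmap is a plain Python list, and leading and trailing excluded runs are always trimmed, which the real tool may handle differently. The sketch also assumes a run whose length equals the threshold is enough to split.

```python
def split_segments(bitmap, threshold):
    """Return (start, end) page ranges kept in the dump; end is exclusive.

    bitmap[i] == 1 means page i is kept, 0 means it is excluded.
    A run of zeros of length >= threshold closes the current segment.
    Shorter runs stay inside the segment and are still written out.
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    segments = []
    start = None       # first page of the open segment, or None
    last_kept = None   # most recent kept page in the open segment
    zeros = 0          # length of the current run of excluded pages
    for pfn, kept in enumerate(bitmap):
        if not kept:
            zeros += 1
            continue
        if start is None:
            start = pfn
        elif zeros >= threshold:
            # The excluded run is long enough: truncate and start anew.
            segments.append((start, last_kept + 1))
            start = pfn
        zeros = 0
        last_kept = pfn
    if start is not None:
        segments.append((start, last_kept + 1))
    return segments


def pages_written(segments):
    """Total number of pages covered by the segments."""
    return sum(end - start for start, end in segments)


bitmap = [1, 1, 0, 0, 1, 0, 0, 0, 0, 1]
for threshold in (256, 4, 2):
    segs = split_segments(bitmap, threshold)
    print(threshold, segs, pages_written(segs))
```

In this toy bitmap, lowering the threshold raises the segment count while lowering the number of pages written, which is the trade-off the option exposes.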
>>>>>
>>>>> Using an existing vmcore, this was tested by the following:
>>>>>
>>>>> % makedumpfile -E -d31 --exclude-threshold=256 -x vmlinux vmcore newvmcore256
>>>>> % makedumpfile -E -d31 --exclude-threshold=4 -x vmlinux vmcore newvmcore4
>>>>>
>>>>> I use -d31 in order to exclude as many page types as possible,
>>>>> resulting in significantly smaller file sizes than the original
>>>>> vmcore.
>>>>>
>>>>> -rwxrwx--- 1 edevolde edevolde 4034564096 Jun 27 10:24 vmcore
>>>>> -rw------- 1 edevolde edevolde 119808156 Jul 6 13:01 newvmcore256
>>>>> -rw------- 1 edevolde edevolde 100811276 Jul 6 13:08 newvmcore4
>>>>>
>>>>> The use of a smaller value of PFN_EXCLUDED increases the number of
>>>>> output segments (the 'Number of program headers' in the readelf
>>>>> output) in the ELF dump file.
>>>>
>>>> How will you tune the value? I'm not sure what the benefit of a
>>>> tunable PFN_EXCLUDED is. If there is no regression caused by too many
>>>> PT_LOAD entries, I think we can decide on a concrete PFN_EXCLUDED.
>>>
>>> Allow me to note two things prior to addressing the question.
>>>
>>> Note that the value for PFN_EXCLUDED really should be in the range:
>>>
>>> 1 <= PFN_EXCLUDED <= NUM_PAGES(largest segment)
>>>
>>> but that values larger than NUM_PAGES(largest segment) behave the same
>>> as NUM_PAGES(largest segment) and simply prevent makedumpfile from ever
>>> omitting excluded pages from the dump file.
>>>
>>> Also note that the ELF header allows for a 16-bit e_phnum value for the
>>> number of segments in the dump file. As it stands today, I doubt that
>>> anybody has come close to reaching 65535 segments, but given the
>>> combination of larger and larger memories and the work we (Oracle) are
>>> doing to further enhance the capabilities of makedumpfile, I believe we
>>> will start to challenge this 65535 number.
>
> I overlooked the limit on the number of segments, so I considered
> only "The first benefit" you describe below.
>
>>> The ability to tune PFN_EXCLUDED allows one to minimize file size while
>>> still staying within ELF boundaries.
>>>
>>> There are two ways in which having PFN_EXCLUDED as a tunable parameter
>>> benefits the user.
>>>
>>> The first benefit is, when making PFN_EXCLUDED smaller, makedumpfile has
>>> more opportunities to NOT write excluded pages to the resulting dump
>>> file, thus obtaining a smaller overall dump file size. And since a
>>> PT_LOAD header is smaller than a page, this penalty (of more segments)
>>> will always result in a smaller file size. (In the example I cite, the
>>> dump file was 18MB smaller with a PFN_EXCLUDED value of 4 than with the
>>> default of 256, in spite of increasing the number of segments from 6 to
>>> 244.)
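The arithmetic behind that claim can be checked with the numbers quoted in this thread. This is only a rough sketch: the 4096-byte page size is an assumption (the x86_64 default), and 56 bytes is the size of an Elf64_Phdr entry, so a 64-bit ELF dump is assumed as well.

```python
PAGE_SIZE = 4096        # assumption: 4 KiB pages
ELF64_PHDR_SIZE = 56    # size of one Elf64_Phdr (PT_LOAD) entry in bytes

# File sizes and segment counts quoted earlier in the thread.
size_256, segments_256 = 119808156, 6
size_4, segments_4 = 100811276, 244

saved_bytes = size_256 - size_4
header_overhead = (segments_4 - segments_256) * ELF64_PHDR_SIZE

print(saved_bytes)                      # bytes saved by threshold 4
print(round(saved_bytes / 2**20, 1))    # the same figure in MiB
print(header_overhead)                  # bytes spent on extra headers


def min_net_saving(threshold):
    """Lower bound on bytes saved by one extra split at this threshold.

    A split drops at least `threshold` uncompressed pages and costs one
    additional program header.
    """
    return threshold * PAGE_SIZE - ELF64_PHDR_SIZE


print(min_net_saving(1))   # still positive at the smallest threshold
```

Under these assumptions the 238 extra headers cost about 13 KB against roughly 18 MiB saved, and even a split that removes a single page comes out ahead.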
>>>
>>> The second benefit is, when enabling PFN_EXCLUDED to become larger, it
>>> allows makedumpfile to continue to generate valid ELF dump files in the
>>> presence of larger and larger memory systems. Generally speaking, the
>>> goal is to minimize the size of dump files via the exclusion of
>>> uninteresting pages (ie zero, free, etc), especially as the size of
>>> memory continues to grow and grow. As the memory increases, there are
>>> more and more of these uninteresting pages, and more opportunities for
>>> makedumpfile to omit them (even at the current PFN_EXCLUDED value of
>>> 256). Furthermore, we are working on additional page exclusion
>>> strategies that will drive more and more opportunities for makedumpfile
>>> to omit these pages from the dump file. And as makedumpfile omits more
>>> and more pages from the dump file, that increases the number of segments
>>> needed.
>>>
>>> By enabling a user to tune the value of PFN_EXCLUDED, we provide an
>>> additional mechanism to balance the size of the ELF dump file with
>>> respect to the size of memory.
>>
>> It occurred to me that we could offer the option
>> "--exclude-threshold=auto", whereby a binary search on the second bitmap
>> in the function get_loads_dumpfile_cyclic() determines the optimum value
>> of PFN_EXCLUDED (optimum here meaning the smallest possible value while
>> still staying within 65535 segments, which would yield the smallest
>> possible dump file size for the given constraints). Would that not be an
>> excellent feature to have?
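One way the proposed "auto" search could look is sketched below. It is not the makedumpfile implementation: count_segments and auto_threshold are hypothetical names, the bitmap is a plain Python list, and the segment-counting rule is the simplified one in which a run of excluded pages of length >= threshold starts a new segment. The search relies on the segment count never increasing as the threshold grows.

```python
def count_segments(bitmap, threshold):
    """Count output segments for a given threshold (simplified model)."""
    segments = 0
    zeros = 0            # length of the current run of excluded pages
    seen_kept = False    # True once the first kept page has been seen
    for kept in bitmap:
        if not kept:
            zeros += 1
            continue
        if not seen_kept or zeros >= threshold:
            segments += 1
        seen_kept = True
        zeros = 0
    return segments


def auto_threshold(bitmap, max_segments):
    """Smallest threshold whose segment count is <= max_segments."""
    if max_segments < 1:
        raise ValueError("max_segments must be at least 1")
    # No run between two kept pages can reach len(bitmap), so the upper
    # bound always yields at most one segment and is always feasible.
    lo, hi = 1, max(len(bitmap), 1)
    while lo < hi:
        mid = (lo + hi) // 2
        if count_segments(bitmap, mid) <= max_segments:
            hi = mid          # feasible: try a smaller threshold
        else:
            lo = mid + 1      # too many segments: threshold must grow
    return lo


bitmap = [1, 1, 0, 0, 1, 0, 0, 0, 0, 1]
for limit in (3, 2, 1):
    print(limit, auto_threshold(bitmap, limit))
```

Each probe is one linear pass over the bitmap, so the whole search costs about log2(number of pages) passes; that is the processing time that would need to be measured on a real vmcore.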
>
> I think "auto" is necessary for --exclude-threshold; the optimum
> value should be calculated automatically. Otherwise, users are forced into
> trial and error every time, which doesn't sound practical. IOW, this patch
> is unacceptable if there is no mechanism to support users.
> So now, my only concern for this option is the processing time of the
> binary search.
OK, so the idea of "tuning" the value of PFN_EXCLUDED is agreeable,
great! I will work on the binary search and report back with
measurements of the processing time of 'crash'. From there we can
determine if the benefit is worthwhile.
Regards,
eric
>
> [snip]
>>>>> And with a larger number of segments, loading both vmcore and newvmcore4
>>>>> into 'crash' resulted in identical outputs when run with the dmesg, ps,
>>>>> files, mount, and net sub-commands.
>>>>
>>>> What about the processing speed of crash, is there no slow down ?
>>>
>>> I did not observe a noticeable change in processing speed of crash.
>
> Good, though it would be better to back this up with actual measured results.
>
> Thanks,
> Atsushi Kumagai
_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec