From: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
To: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
Cc: "bhe@redhat.com" <bhe@redhat.com>,
"tom.vaden@hp.com" <tom.vaden@hp.com>,
"kexec@lists.infradead.org" <kexec@lists.infradead.org>,
"ptesarik@suse.cz" <ptesarik@suse.cz>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"lisa.mitchell@hp.com" <lisa.mitchell@hp.com>,
"vgoyal@redhat.com" <vgoyal@redhat.com>,
"anderson@redhat.com" <anderson@redhat.com>,
"ebiederm@xmission.com" <ebiederm@xmission.com>,
"jingbai.ma@hp.com" <jingbai.ma@hp.com>
Subject: Re: [PATCH 0/3] makedumpfile: hugepage filtering for vmcore dump
Date: Thu, 28 Nov 2013 16:48:47 +0900
Message-ID: <5296F55F.30403@jp.fujitsu.com>
In-Reply-To: <0910DD04CBD6DE4193FCF86B9C00BE971C7EC5@BPXM01GP.gisp.nec.co.jp>
(2013/11/28 16:08), Atsushi Kumagai wrote:
> On 2013/11/22 16:18:20, kexec <kexec-bounces@lists.infradead.org> wrote:
>> (2013/11/07 9:54), HATAYAMA Daisuke wrote:
>>> (2013/11/06 11:21), Atsushi Kumagai wrote:
>>>> (2013/11/06 5:27), Vivek Goyal wrote:
>>>>> On Tue, Nov 05, 2013 at 09:45:32PM +0800, Jingbai Ma wrote:
>>>>>> This patch set intends to exclude unnecessary hugepages from the vmcore dump file.
>>>>>>
>>>>>> This patch requires the kernel patch to export necessary data structures into
>>>>>> vmcore: "kexec: export hugepage data structure into vmcoreinfo"
>>>>>> http://lists.infradead.org/pipermail/kexec/2013-November/009997.html
>>>>>>
>>>>>> This patch introduces two new dump levels, 32 and 64, to exclude all unused and
>>>>>> active hugepages. The level to exclude all unnecessary pages will be 127 now.
>>>>>
>>>>> Interesting. Why should hugepages be treated any differently than normal
>>>>> pages?
>>>>>
>>>>> If the user asked to filter out free pages, then they should be filtered, and
>>>>> it should not matter whether a page is a huge page or not?
>>>>
>>>> I'm making an RFC patch for hugepage filtering based on such a policy.
>>>>
>>>> I attach the prototype version.
>>>> It can also filter out THPs, and it is suitable for cyclic processing
>>>> because it depends on mem_map, and the mem_map lookup can be divided into
>>>> cycles. This is the same idea as page_is_buddy().
>>>>
>>>> So I think it's better.
>>>>
>>>
>>>> @@ -4506,14 +4583,49 @@ __exclude_unnecessary_pages(unsigned long mem_map,
>>>>  		    && !isAnon(mapping)) {
>>>>  			if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
>>>>  				pfn_cache_private++;
>>>> +			/*
>>>> +			 * NOTE: If THP for cache is introduced, the check for
>>>> +			 *       compound pages is needed here.
>>>> +			 */
>>>>  		}
>>>>  		/*
>>>>  		 * Exclude the data page of the user process.
>>>>  		 */
>>>> -		else if ((info->dump_level & DL_EXCLUDE_USER_DATA)
>>>> -		    && isAnon(mapping)) {
>>>> -			if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
>>>> -				pfn_user++;
>>>> +		else if (info->dump_level & DL_EXCLUDE_USER_DATA) {
>>>> +			/*
>>>> +			 * Exclude the anonymous pages as user pages.
>>>> +			 */
>>>> +			if (isAnon(mapping)) {
>>>> +				if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
>>>> +					pfn_user++;
>>>> +
>>>> +				/*
>>>> +				 * Check the compound page
>>>> +				 */
>>>> +				if (page_is_hugepage(flags) && compound_order > 0) {
>>>> +					int i, nr_pages = 1 << compound_order;
>>>> +
>>>> +					for (i = 1; i < nr_pages; ++i) {
>>>> +						if (clear_bit_on_2nd_bitmap_for_kernel(pfn + i))
>>>> +							pfn_user++;
>>>> +					}
>>>> +					pfn += nr_pages - 2;
>>>> +					mem_map += (nr_pages - 1) * SIZE(page);
>>>> +				}
>>>> +			}
>>>> +			/*
>>>> +			 * Exclude the hugetlbfs pages as user pages.
>>>> +			 */
>>>> +			else if (hugetlb_dtor == SYMBOL(free_huge_page)) {
>>>> +				int i, nr_pages = 1 << compound_order;
>>>> +
>>>> +				for (i = 0; i < nr_pages; ++i) {
>>>> +					if (clear_bit_on_2nd_bitmap_for_kernel(pfn + i))
>>>> +						pfn_user++;
>>>> +				}
>>>> +				pfn += nr_pages - 1;
>>>> +				mem_map += (nr_pages - 1) * SIZE(page);
>>>> +			}
>>>>  		}
>>>>  		/*
>>>>  		 * Exclude the hwpoison page.
>>>
>>> I'm concerned about the case where filtering is not performed on the part of the
>>> mem_map entries that does not belong to the current cyclic range.
>>>
>>> If the maximum value of compound_order is larger than the maximum value of
>>> CONFIG_FORCE_MAX_ZONEORDER, which makedumpfile obtains by ARRAY_LENGTH(zone.free_area),
>>> it's necessary to align info->bufsize_cyclic with the larger one in
>>> check_cyclic_buffer_overrun().
>>>
>>
>> ping, in case you overlooked this...
>
> Sorry for the delayed response; I'm prioritizing the release of v1.5.5 now.
>
> Thanks for your advice; check_cyclic_buffer_overrun() should be fixed
> as you said. In addition, I'm considering another way to address such a case:
> carry the number of "overflowed pages" over to the next cycle and
> exclude them at the top of __exclude_unnecessary_pages(), like below:
>
> 	/*
> 	 * The pages which should be excluded still remain.
> 	 */
> 	if (remainder >= 1) {
> 		int i;
> 		unsigned long tmp;
> 		for (i = 0; i < remainder; ++i) {
> 			if (clear_bit_on_2nd_bitmap_for_kernel(pfn + i)) {
> 				pfn_user++;
> 				tmp++;
> 			}
> 		}
> 		pfn += tmp;
> 		remainder -= tmp;
> 		mem_map += (tmp - 1) * SIZE(page);
> 		continue;
> 	}
>
> If this approach works well, then aligning info->bufsize_cyclic will be
> unnecessary.
>
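To make sure we are talking about the same thing, here is a minimal,
standalone sketch of the carry-over idea quoted above. It is not makedumpfile
code: keep[], clear_page(), exclude_cycle(), run_all_cycles(), NR_PFNS and
CYCLE_LEN are all hypothetical names for illustration, and head_order[] stands
in for the compound_order read from mem_map (-1 means "not a page to exclude").

```c
#include <assert.h>
#include <string.h>

#define NR_PFNS   32UL	/* hypothetical number of pfns in the dump */
#define CYCLE_LEN  8UL	/* hypothetical number of pfns handled per cycle */

static unsigned char keep[NR_PFNS];	/* 1 = page is still in the dump */

/* Hypothetical stand-in for clear_bit_on_2nd_bitmap_for_kernel(). */
static unsigned long clear_page(unsigned long pfn)
{
	assert(pfn < NR_PFNS);
	if (!keep[pfn])
		return 0;
	keep[pfn] = 0;
	return 1;
}

/*
 * Filter the pfns in [start, end). *remainder holds the number of tail
 * pages of a compound page that did not fit into the previous cycle.
 */
static unsigned long exclude_cycle(unsigned long start, unsigned long end,
				   const int *head_order,
				   unsigned long *remainder)
{
	unsigned long pfn = start, excluded = 0, i;

	/* First finish the compound page that crossed the cycle boundary. */
	while (*remainder > 0 && pfn < end) {
		excluded += clear_page(pfn++);
		(*remainder)--;
	}

	while (pfn < end) {
		unsigned long nr, in_cycle;

		if (head_order[pfn] < 0) {
			pfn++;
			continue;
		}
		nr = 1UL << head_order[pfn];
		in_cycle = (end - pfn < nr) ? end - pfn : nr;
		for (i = 0; i < in_cycle; i++)
			excluded += clear_page(pfn + i);
		*remainder = nr - in_cycle;	/* carried to the next cycle */
		pfn += in_cycle;
	}
	return excluded;
}

/* Run every cycle in order and return the total number of excluded pages. */
static unsigned long run_all_cycles(const int *head_order)
{
	unsigned long start, end, total = 0, remainder = 0;

	memset(keep, 1, sizeof(keep));
	for (start = 0; start < NR_PFNS; start += CYCLE_LEN) {
		end = start + CYCLE_LEN < NR_PFNS ? start + CYCLE_LEN : NR_PFNS;
		total += exclude_cycle(start, end, head_order, &remainder);
	}
	return total;
}
```

With a compound page of order 2 whose head is at pfn 6, the first cycle
clears pfns 6 and 7 and the second cycle clears pfns 8 and 9 from the carried
remainder. Note that this only works if the cycles are processed in
ascending pfn order and the remainder survives between them.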
I chose the current implementation, which changes the cyclic buffer size, because
I thought it was simpler than carrying the remaining pages to be filtered over to the
next cycle: it needs no extra code in the filtering path.
I guess the reason you now think the carry-over approach is better is that the maximum
order of a huge page is hard to detect in some way, right?
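
For comparison, the alignment approach can be sketched roughly as below. This
is only an illustration under stated assumptions, not the actual
check_cyclic_buffer_overrun(): align_cycle_bytes() and its parameters are
hypothetical names, one bitmap byte is assumed to cover 8 pfns, each cycle is
assumed to start on an aligned pfn, and compound pages are assumed to be
naturally aligned to their own size.

```c
#include <assert.h>

#define BITS_PER_BYTE 8UL

/*
 * Round the cyclic buffer size (in bytes, one bit per pfn) down so that
 * the number of pfns per cycle is a multiple of the largest block that
 * must not be split across cycles. Returns 0 if the buffer is too small
 * to hold even one such block.
 */
static unsigned long align_cycle_bytes(unsigned long bufsize,
				       unsigned int max_buddy_order,
				       unsigned int max_compound_order)
{
	/* Align with the larger of the two orders. */
	unsigned int order = max_buddy_order > max_compound_order ?
			     max_buddy_order : max_compound_order;
	unsigned long block_pages, block_bytes;

	assert(order < sizeof(unsigned long) * BITS_PER_BYTE);
	block_pages = 1UL << order;
	/* Blocks smaller than 8 pages still occupy a whole bitmap byte. */
	block_bytes = (block_pages + BITS_PER_BYTE - 1) / BITS_PER_BYTE;

	if (bufsize < block_bytes)
		return 0;
	return bufsize - bufsize % block_bytes;
}
```

For example, a 1000-byte buffer with a largest order of 10 (1024 pages, 128
bitmap bytes) would be rounded down to 896 bytes, i.e. 7 blocks per cycle.
The weak point is the one asked about above: the caller has to know the
maximum compound order in advance.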
--
Thanks.
HATAYAMA, Daisuke