From: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
To: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
Cc: kexec@lists.infradead.org, cpw@sgi.com
Subject: Re: [PATCH] makedumpfile: Petr Tesarik's hugepage filtering
Date: Fri, 24 May 2013 11:24:23 +0900	[thread overview]
Message-ID: <519ECF57.1080600@jp.fujitsu.com> (raw)
In-Reply-To: <20130524101121.58d765fdfd5800dee0f1b951@mxc.nes.nec.co.jp>

(2013/05/24 10:11), Atsushi Kumagai wrote:
> On Wed, 15 May 2013 13:43:59 -0500
> Cliff Wickman <cpw@sgi.com> wrote:

>>   	/*
>>   	 * The bitmap buffer is not dirty, and it is not necessary
>>   	 * to write out it.
>> @@ -4345,7 +4386,7 @@ __exclude_unnecessary_pages(unsigned lon
>>   		 * Exclude the data page of the user process.
>>   		 */
>>   		else if ((info->dump_level & DL_EXCLUDE_USER_DATA)
>> -		    && isAnon(mapping)) {
>> +		    && (isAnon(mapping) || isHugePage(flags))) {
>>   			if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
>>   				pfn_user++;
>
> This code will filter out both transparent huge pages and hugetlbfs pages.
> But I think hugetlbfs pages shouldn't be filtered out as anonymous pages,
> because anonymous pages and hugetlbfs pages are clearly different kinds
> of pages.
>
> Therefore, I plan to add a new dump level to filter out hugetlbfs pages.
> My basic idea is like below:
>
>
>   		 * Exclude the data page of the user process.
>   		 */
>                  else if ((info->dump_level & DL_EXCLUDE_USER_DATA)
>                      && isAnon(mapping)) {
>                          if (clear_bit_on_2nd_bitmap_for_kernel(pfn))
>                                  pfn_user++;
> +
> +                       if (isPageHead(flags)) {
> +                               int i, nr_pages = 1 << compound_order;
> +
> +                               for (i = 1; i < nr_pages; ++i) {
> +                                       if (clear_bit_on_2nd_bitmap_for_kernel(pfn + i))
> +                                               pfn_user++;
> +                               }
> +                               pfn += nr_pages - 2;
> +                               mem_map += (nr_pages - 1) * SIZE(page);
> +                       }

This seems correct. We don't know when or where a page table gets transparently remapped
with huge pages. If user-space pages and transparent huge pages were filtered separately,
data larger than the huge page size, 2MiB or 1GiB on x86, could be corrupted. Consider the
case where 2MiB + 4KiB of data is allocated on a 2MiB boundary and only the first 2MiB
area is remapped as a transparent huge page: the two parts must be filtered together.

> +               }
> +               /*
> +                * Exclude the hugetlbfs pages.
> +                */
> +               else if ((info->dump_level & DL_EXCLUDE_HUGETLBFS_DATA)
> +                        && (!isAnon(mapping) && isPageHead(flags))) {
> +                       int i, nr_pages = 1 << compound_order;
> +
> +                       for (i = 0; i < nr_pages; ++i) {
> +                               if (clear_bit_on_2nd_bitmap_for_kernel(pfn + i))
> +                                       pfn_hugetlb++;
> +                       }
> +                       pfn += nr_pages - 1;
> +                       mem_map += (nr_pages - 1) * SIZE(page);
>                  }
>                  /*
>                   * Exclude the hwpoison page.
>

hugetlbfs has no such risk, so this seems basically OK to me.

Also, hugetlbfs pages are "user data", so they should be filtered whenever the
user-data dump level is specified. So how about:

       Dump  |  zero   without  with     user    free   hugetlbfs
       Level |  page   private  private  data    page   page
      -------+-----------------------------------------------------
          0  |
          1  |   X
          2  |           X
          4  |           X        X
          8  |                            X              X
         16  |                                    X
         32  |                                           X
         63  |   X       X        X       X       X      X

At dump level 8, hugetlbfs pages also get filtered.

>
> But before that, I would like to hear opinions about the new dump level
> to make sure whether the level is worthwhile.
>



-- 
Thanks.
HATAYAMA, Daisuke



Thread overview: 4+ messages
     [not found] <E1UcggB-0005gy-4i@eag09.americas.sgi.com>
2013-05-24  1:11 ` [PATCH] makedumpfile: Petr Tesarik's hugepage filtering Atsushi Kumagai
2013-05-24  2:24   ` HATAYAMA Daisuke [this message]
2013-05-24  2:40     ` HATAYAMA Daisuke
2013-05-27 10:56       ` Atsushi Kumagai
