linux-kernel.vger.kernel.org archive mirror
From: Rui Teng <rui.teng@linux.vnet.ibm.com>
To: Dave Hansen <dave.hansen@linux.intel.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Michal Hocko <mhocko@suse.com>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	"Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>,
	Paul Gortmaker <paul.gortmaker@windriver.com>,
	Santhosh G <santhog4@in.ibm.com>
Subject: Re: [PATCH] memory-hotplug: Fix bad area access on dissolve_free_huge_pages()
Date: Tue, 20 Sep 2016 23:52:25 +0800
Message-ID: <fc05ee3c-097f-709b-7484-1cadc9f3ce22@linux.vnet.ibm.com>
In-Reply-To: <57E14D64.6090609@linux.intel.com>

On 9/20/16 10:53 PM, Dave Hansen wrote:
> On 09/20/2016 07:45 AM, Rui Teng wrote:
>> On 9/17/16 12:25 AM, Dave Hansen wrote:
>>>
>>> That's an interesting data point, but it still doesn't quite explain
>>> what is going on.
>>>
>>> It seems like there might be parts of gigantic pages that have
>>> PageHuge() set on tail pages, while other parts don't.  If that's true,
>>> we have another bug and your patch just papers over the issue.
>>>
>>> I think you really need to find the root cause before we apply this
>>> patch.
>>>
>> The root cause is that the test script
>> (tools/testing/selftests/memory-hotplug/mem-on-off-test.sh) changes the
>> online/offline status of memory blocks other than the one holding the
>> head page. It *randomly* selects 10% of the memory blocks from
>> /sys/devices/system/memory/memory*, and changes their online/offline
>> status.
>
> Ahh, that does explain it!  Thanks for digging into that!
>
>> That's why we need a PageHead() check now, and why this problem does
>> not happen on systems with smaller huge pages such as 16M.
>>
>> As for PageHuge(), I think it returns true for all tail pages, because
>> it first gets the compound_head of the tail page and then checks the
>> head's huge page flag.
>>     page = compound_head(page);
>>
>> As for the failure message, if a memory block is in use, offlining it
>> returns failure.
>
> That's good, but aren't we still left with a situation where we've
> offlined and dissolved the _middle_ of a gigantic huge page while the
> head page is still in place and online?
>
> That seems bad.
>
What about refusing to change the status of such a memory block if it
contains part of a huge page that is larger than the block itself (in
memory_block_action())?
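
To make the proposed check concrete, here is a minimal user-space sketch
of the idea. It is not kernel code: struct pfn_range,
hugepage_straddles_block() and model_block_action() are hypothetical names
invented for illustration, ranges are half-open [start, end) page frame
numbers, and the -1 return value only stands in for whatever error code a
real patch would choose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model type: a half-open range of page frame numbers. */
struct pfn_range {
    uint64_t start; /* first pfn in the range */
    uint64_t end;   /* one past the last pfn  */
};

/*
 * True when the huge page overlaps the memory block but is not fully
 * contained in it, i.e. changing the block's state on its own would
 * affect only part of the huge page.
 */
static bool hugepage_straddles_block(struct pfn_range block,
                                     struct pfn_range hpage)
{
    assert(block.start < block.end && hpage.start < hpage.end);

    bool overlaps = hpage.start < block.end && block.start < hpage.end;
    bool contained = hpage.start >= block.start && hpage.end <= block.end;

    return overlaps && !contained;
}

/*
 * Model of the proposed refusal: return -1 (assumed error value) if any
 * huge page in the list straddles the block, 0 otherwise.
 */
static int model_block_action(struct pfn_range block,
                              const struct pfn_range *hpages, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (hugepage_straddles_block(block, hpages[i]))
            return -1;
    }
    return 0;
}
```

With this model, a block that covers only the middle (or only the head) of
a gigantic page is refused, while a block that fully contains a smaller
huge page, or does not touch any huge page, is allowed.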

I think it will not affect memory hot-plug too much. If we really want
to hot-plug such memory, we can set nr_hugepages to zero first.
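
As a rough illustration of the "set nr_hugepages to zero first" step, the
sketch below writes 0 to a hugetlb pool-size file. The helper name
release_hugepage_pool() is hypothetical. The path is passed in by the
caller; /proc/sys/vm/nr_hugepages is the usual file for the default huge
page size, but whether a gigantic page pool can be shrunk at runtime
depends on the kernel and on whether the pages are in use, so treat this
as an assumption rather than a guaranteed procedure.

```c
#include <stdio.h>

/*
 * Write "0" to the given pool-size file (for example
 * /proc/sys/vm/nr_hugepages, which normally requires root).
 * Returns 0 on success and -1 if the file cannot be opened or written.
 * A successful write only requests the shrink; it does not prove that
 * every huge page was actually released.
 */
static int release_hugepage_pool(const char *path)
{
    FILE *f = fopen(path, "w");
    if (!f)
        return -1;

    int ok = fputs("0\n", f) >= 0;

    /* fclose() flushes the buffer, so a failed flush is reported here. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

After the write, reading the same file back shows how many huge pages
remain in the pool; a non-zero value means some pages could not be freed.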

I also found that __test_page_isolated_in_pageblock() cannot handle a
gigantic page well, which causes a device busy error later. I am still
investigating that.

Any suggestions?

Thanks!

Thread overview: 15+ messages
2016-09-13  8:39 [PATCH] memory-hotplug: Fix bad area access on dissolve_free_huge_pages() Rui Teng
2016-09-13 17:32 ` Dave Hansen
2016-09-14 16:33   ` Rui Teng
2016-09-14 16:37     ` Dave Hansen
2016-09-16 13:58       ` Rui Teng
2016-09-16 16:25         ` Dave Hansen
2016-09-20 14:45           ` Rui Teng
2016-09-20 14:53             ` Dave Hansen
2016-09-20 15:52               ` Rui Teng [this message]
2016-09-20 17:43                 ` Dave Hansen
2016-09-21 12:05                   ` Michal Hocko
2016-09-21 16:04                     ` Dave Hansen
2016-09-21 16:27                       ` Michal Hocko
2016-09-21 16:32                         ` Dave Hansen
2016-09-21 16:52                           ` Michal Hocko
