From: Mike Kravetz <mike.kravetz@oracle.com>
To: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Michal Hocko <mhocko@kernel.org>,
Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>,
Anshuman Khandual <khandual@linux.vnet.ibm.com>,
Andrew Morton <akpm@linux-foundation.org>,
"stable@vger.kernel.org" <stable@vger.kernel.org>
Subject: Re: [PATCH 1/1] mm:hugetlbfs: Fix hwpoison reserve accounting
Date: Mon, 23 Oct 2017 11:20:02 -0700 [thread overview]
Message-ID: <26945734-ac7e-f71e-dbfa-0b0f0fdaff32@oracle.com> (raw)
In-Reply-To: <20171023073258.GA5115@hori1.linux.bs1.fc.nec.co.jp>
On 10/23/2017 12:32 AM, Naoya Horiguchi wrote:
> On Fri, Oct 20, 2017 at 10:49:46AM -0700, Mike Kravetz wrote:
>> On 10/19/2017 07:30 PM, Naoya Horiguchi wrote:
>>> On Thu, Oct 19, 2017 at 04:00:07PM -0700, Mike Kravetz wrote:
>>>
>>> Thank you for addressing this. The patch itself looks good to me, but
>>> the reported issue (negative reserve count) doesn't reproduce in my trial
>>> with v4.14-rc5, so could you share the exact procedure for this issue?
>>
>> Sure, but first one question on your test scenario below.
>>
>>>
>>> When error handler runs over a huge page, the reserve count is incremented
>>> so I'm not sure why the reserve count goes negative.
>>
>> I'm not sure I follow. What specific code is incrementing the reserve
>> count?
>
> The call path is like below:
>
> hugetlbfs_error_remove_page
> hugetlb_fix_reserve_counts
> hugepage_subpool_get_pages(spool, 1)
> hugetlb_acct_memory(h, 1);
> gather_surplus_pages
> h->resv_huge_pages += delta;
>
Ah OK. This is a result of call to hugetlb_fix_reserve_counts which
I believe is incorrect in most instances, and is unlikely to happen
with my patch.
>>
>> Remove the file (rm /var/opt/oracle/hugepool/foo)
>> -------------------------------------------------
>> HugePages_Total: 1
>> HugePages_Free: 0
>> HugePages_Rsvd: 18446744073709551615
>> HugePages_Surp: 0
>> Hugepagesize: 2048 kB
>>
>> I am still confused about how your test maintains a reserve count after
>> poisoning. It may be a good idea for you to test my patch with your
>> test scenario as I can not recreate here.
>
> Interestingly, I found that this reproduces if all hugetlb pages are
> reserved when poisoning.
> Your testing meets the condition, and mine doesn't.
>
> In gather_surplus_pages() we determine whether we extend hugetlb pool
> with surplus pages like below:
>
> needed = (h->resv_huge_pages + delta) - h->free_huge_pages;
> if (needed <= 0) {
> h->resv_huge_pages += delta;
> return 0;
> }
> ...
>
> needed is 1 if h->resv_huge_pages == h->free_huge_pages, and then
> the reserve count gets inconsistent.
> I confirmed that your patch fixes the issue, so I'm OK with it.
Thanks. That now makes sense to me.
hugetlb_fix_reserve_counts (which results in gather_surplus_pages being
called), is only designed to be called in the extremely rare cases when
we have free'ed a huge page but are unable to free the reservation entry.
Just curious, when the hugetlb_fix_reserve_counts call was added to
hugetlbfs_error_remove_page, was the intention to preserve the original
reservation? I remember thinking hard about that for the hole punch
case and came to the conclusion that it was easier and less error prone
to remove the reservation as well. That will also happen in the error
case with the patch I provided.
--
Mike Kravetz
>
> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
>
> Thanks,
> Naoya Horiguchi
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2017-10-23 18:20 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-10-19 23:00 [PATCH 0/1] mm:hugetlbfs: Fix hwpoison reserve accounting Mike Kravetz
2017-10-19 23:00 ` [PATCH 1/1] " Mike Kravetz
2017-10-20 2:30 ` Naoya Horiguchi
2017-10-20 17:49 ` Mike Kravetz
2017-10-23 7:32 ` Naoya Horiguchi
2017-10-23 18:20 ` Mike Kravetz [this message]
2017-10-24 0:46 ` Naoya Horiguchi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=26945734-ac7e-f71e-dbfa-0b0f0fdaff32@oracle.com \
--to=mike.kravetz@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=aneesh.kumar@linux.vnet.ibm.com \
--cc=khandual@linux.vnet.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=n-horiguchi@ah.jp.nec.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).