From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Paul Mackerras <paulus@ozlabs.org>, linuxppc-dev@lists.ozlabs.org
Cc: linux-mm@kvack.org
Subject: Re: Problems with THP in v4.5-rc4 on POWER
Date: Sat, 20 Feb 2016 20:00:42 +0530 [thread overview]
Message-ID: <878u2fe9nx.fsf@linux.vnet.ibm.com> (raw)
In-Reply-To: <20160220055419.GB16191@fergus.ozlabs.ibm.com>
Paul Mackerras <paulus@ozlabs.org> writes:
> On Sat, Feb 20, 2016 at 12:39:42PM +1100, Paul Mackerras wrote:
>> It seems there's something wrong with our transparent hugepage
>> implementation on POWER server processors as of v4.5-rc4. I have seen
>> the email thread on "[BUG] random kernel crashes after THP rework on
>> s390 (maybe also on PowerPC and ARM)", but this doesn't seem exactly
>> the same as that (though it may of course be related).
>>
>> I have been testing v4.5-rc4 with Aneesh's patch "powerpc/mm/hash:
>> Clear the invalid slot information correctly" on top, on a KVM guest
>> with 160 vcpus (threads=8) and 32GB of memory backed by 16MB large
>> pages, running on a POWER8 machine running a 4.4.1 host kernel (20
>> cores * 8 threads, 128GB of RAM). The guest kernel is compiled with
>> THP enabled and set to "always" (i.e. not "madvise").
>>
>> On this setup, when doing something like a large kernel compile, I see
>> random segfaults happening (in gcc, cc1, sh, etc.). I also see bursts
>> of messages like this on the host console:
>>
>> [50957.570859] Harmless Hypervisor Maintenance interrupt [Recovered]
>> [50957.570864] Error detail: Processor Recovery done
>> [50957.570869] HMER: 2040000000000000
>
> When I use a merge of v4.5-rc4 with the fixes branch from the powerpc
> tree, I don't see these messages any more, presumably due to
> "powerpc/mm: Fix Multi hit ERAT cause by recent THP update". With my
> patch, I still see that it is finding HPTEs to invalidate, but without
> my patch, even though it is presumably leaving HPTEs around, I don't
> see any errors (such as random segfaults) occurring.
>
Yes that patch should fix that. With thp, we use the deposited page
table to store hash pte slot information. On splitting, we were doing a
withdraw of deposited table, without doing the required hash pte flush.
This fix is to call pmdp_huge_split_prepare, which does the required
flush and also clear the _PAGE_USER. Clearning _PAGE_USER ensure that we
don't be inserting any other hash pte entries w.r.t to this THP.
-aneesh
WARNING: multiple messages have this Message-ID (diff)
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Paul Mackerras <paulus@ozlabs.org>, linuxppc-dev@lists.ozlabs.org
Cc: linux-mm@kvack.org
Subject: Re: Problems with THP in v4.5-rc4 on POWER
Date: Sat, 20 Feb 2016 20:00:42 +0530 [thread overview]
Message-ID: <878u2fe9nx.fsf@linux.vnet.ibm.com> (raw)
In-Reply-To: <20160220055419.GB16191@fergus.ozlabs.ibm.com>
Paul Mackerras <paulus@ozlabs.org> writes:
> On Sat, Feb 20, 2016 at 12:39:42PM +1100, Paul Mackerras wrote:
>> It seems there's something wrong with our transparent hugepage
>> implementation on POWER server processors as of v4.5-rc4. I have seen
>> the email thread on "[BUG] random kernel crashes after THP rework on
>> s390 (maybe also on PowerPC and ARM)", but this doesn't seem exactly
>> the same as that (though it may of course be related).
>>
>> I have been testing v4.5-rc4 with Aneesh's patch "powerpc/mm/hash:
>> Clear the invalid slot information correctly" on top, on a KVM guest
>> with 160 vcpus (threads=8) and 32GB of memory backed by 16MB large
>> pages, running on a POWER8 machine running a 4.4.1 host kernel (20
>> cores * 8 threads, 128GB of RAM). The guest kernel is compiled with
>> THP enabled and set to "always" (i.e. not "madvise").
>>
>> On this setup, when doing something like a large kernel compile, I see
>> random segfaults happening (in gcc, cc1, sh, etc.). I also see bursts
>> of messages like this on the host console:
>>
>> [50957.570859] Harmless Hypervisor Maintenance interrupt [Recovered]
>> [50957.570864] Error detail: Processor Recovery done
>> [50957.570869] HMER: 2040000000000000
>
> When I use a merge of v4.5-rc4 with the fixes branch from the powerpc
> tree, I don't see these messages any more, presumably due to
> "powerpc/mm: Fix Multi hit ERAT cause by recent THP update". With my
> patch, I still see that it is finding HPTEs to invalidate, but without
> my patch, even though it is presumably leaving HPTEs around, I don't
> see any errors (such as random segfaults) occurring.
>
Yes that patch should fix that. With thp, we use the deposited page
table to store hash pte slot information. On splitting, we were doing a
withdraw of deposited table, without doing the required hash pte flush.
This fix is to call pmdp_huge_split_prepare, which does the required
flush and also clear the _PAGE_USER. Clearning _PAGE_USER ensure that we
don't be inserting any other hash pte entries w.r.t to this THP.
-aneesh
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2016-02-20 14:30 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-02-20 1:39 Problems with THP in v4.5-rc4 on POWER Paul Mackerras
2016-02-20 1:39 ` Paul Mackerras
2016-02-20 5:54 ` Paul Mackerras
2016-02-20 5:54 ` Paul Mackerras
2016-02-20 14:30 ` Aneesh Kumar K.V [this message]
2016-02-20 14:30 ` Aneesh Kumar K.V
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=878u2fe9nx.fsf@linux.vnet.ibm.com \
--to=aneesh.kumar@linux.vnet.ibm.com \
--cc=linux-mm@kvack.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=paulus@ozlabs.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.