From: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
To: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: "linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
linux-mm@kvack.org, Vineet Gupta <vgupta@synopsys.com>,
mpe@ellerman.id.au, akpm@linux-foundation.org,
"linux-snps-arc@lists.infradead.org"
<linux-snps-arc@lists.infradead.org>,
linuxppc-dev@lists.ozlabs.org,
linux-riscv <linux-riscv@lists.infradead.org>,
Gerald Schaefer <gerald.schaefer@de.ibm.com>
Subject: Re: [PATCH v4 00/13] mm/debug_vm_pgtable fixes
Date: Tue, 8 Sep 2020 17:39:06 +0200
Message-ID: <20200908173906.30fffaa0@thinkpad>
In-Reply-To: <20200904180115.07ee5f00@thinkpad>
On Fri, 4 Sep 2020 18:01:15 +0200
Gerald Schaefer <gerald.schaefer@linux.ibm.com> wrote:
[...]
>
> BTW2, a quick test with this change (so far) made the issues on s390
> go away:
>
> @@ -1069,7 +1074,7 @@ static int __init debug_vm_pgtable(void)
> spin_unlock(ptl);
>
> #ifndef CONFIG_PPC_BOOK3S_64
> - hugetlb_advanced_tests(mm, vma, ptep, pte_aligned, vaddr, prot);
> + hugetlb_advanced_tests(mm, vma, (pte_t *) pmdp, pmd_aligned, vaddr, prot);
> #endif
>
> spin_lock(&mm->page_table_lock);
>
> That would more match the "pte_t pointer" usage for hugetlb code,
> i.e. just cast a pmd_t pointer to it. Also changed to pmd_aligned,
> but I think the root cause is the pte_t pointer.
>
> Not entirely sure though if that would really be the correct fix.
> I somehow lost whatever little track I had about what these tests
> really want to check, and if that would still be valid with that
> change.

Uh oh, I wasn't aware that this (or some predecessor) already went
upstream, and it broke our debug kernel today.

I have now found out what goes (horribly) wrong on s390, see below for
more details. In short, using hugetlb primitives with ptep pointers
that do _not_ point to a pmd or pud entry will not work on s390.
It also seems to make no sense to verify / test such a thing in general,
as it would be a severe bug if any kernel code did that.
After all, with hugepages, there are no pte tables, only pmd etc.
tables.

My change above would fix the issue for s390, but I still cannot
fully judge whether it would break other things for your
tests. In general, for normal kernel code, much of what you do would
be very broken, but I guess your tests are doing such "special" things
because they can. E.g. because they operate on some "sandbox" mm
and page tables, and you also do not need properly populated page
tables for some exit / free cleanup, you just throw them away
explicitly with pXd_free at the end. So it might just be "the right
thing" to pass a cast pmd pointer to hugetlb_advanced_tests(),
to simulate and test (proper) usage of the hugetlb primitives.

I also see no other way to make this work for s390 than using a
proper pmd/pud pointer. If that is not possible, please add us to the
#ifndef.

So, for all those interested, here is what goes wrong on s390.
huge_ptep_get_and_clear() uses the "idte" instruction for the
clearing (and TLB invalidation) part. That instruction expects
a "region or segment table" origin, which is a pmd/pud/p4d/pgd,
but not a pte table. Even worse, when we calculate the table
origin from the given ptep (which *should* not point to a pte),
due to different table sizes for pte / pXd tables, we end up
at some place before the given pte table.

The "idte" instruction also gets the virtual address, and does the
corresponding index addition to the given table origin. Depending
on the pmd_index, we now end up either within the pte table again,
in which case we see a panic because idte complains about seeing
a pte value, or, if we are unlucky, outside the pte table. In the
latter case, depending on the content of that memory location, idte
might succeed, effectively corrupting that memory.

That explains why we only see the panic sometimes, depending on the
random vaddr, other symptoms at other times, and probably completely
silent memory corruption for the rest...
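
The address arithmetic described above can be modelled in user space.
The following is only a sketch under stated assumptions, not the s390
implementation: the helper names are hypothetical, the 8-byte entry size
and the 256 / 2048 entries per pte / pmd table are the s390 values, and
the 4 KB origin alignment mask is an assumption made here to show how a
derived origin can drift away from the table the pointer came from.

```c
#include <stdbool.h>
#include <stdint.h>

/* Entry and table sizes as used on s390 (8-byte entries). */
#define ENTRY_SIZE     UINT64_C(8)
#define PTRS_PER_PTE   UINT64_C(256)    /* pte table: 2 KB */
#define PTRS_PER_PMD   UINT64_C(2048)   /* pmd (segment) table: 16 KB */
#define PTE_TABLE_SIZE (PTRS_PER_PTE * ENTRY_SIZE)

/* ASSUMPTION: only the 4 KB aligned part of the origin is honoured. */
#define ORIGIN_MASK    (~UINT64_C(0xfff))

/*
 * Hypothetical model: derive a table origin from the entry pointer by
 * subtracting the pmd index, then add the index back the way the
 * instruction would. Returns the address of the entry that gets touched.
 */
static uint64_t modelled_idte_target(uint64_t entry_ptr, uint64_t pmd_index)
{
	uint64_t origin = (entry_ptr - pmd_index * ENTRY_SIZE) & ORIGIN_MASK;

	return origin + pmd_index * ENTRY_SIZE;
}

/* True if addr lies inside the 2 KB pte table starting at pte_table. */
static bool inside_pte_table(uint64_t pte_table, uint64_t addr)
{
	return addr >= pte_table && addr < pte_table + PTE_TABLE_SIZE;
}
```

In this model, a pointer into a suitably aligned pmd table maps back
to itself, e.g. modelled_idte_target(0x100028, 5) for a table at
0x100000. A pointer into a pte table at 0x200800 does not: with
entry pointer 0x200818, pmd_index 258 lands on a different entry
inside the same pte table, while pmd_index 1 lands below the table.
That matches the "sometimes panic, sometimes corruption" pattern, but
the exact hardware behaviour is not claimed here.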