From: toshi.kani@hpe.com (Kani, Toshi)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v2 2/2] x86/mm: implement free pmd/pte page interfaces
Date: Fri, 27 Apr 2018 14:31:51 +0000 [thread overview]
Message-ID: <1524839460.2693.531.camel@hpe.com> (raw)
In-Reply-To: <20180427073719.GT15462@8bytes.org>
On Fri, 2018-04-27 at 09:37 +0200, joro at 8bytes.org wrote:
> On Thu, Apr 26, 2018 at 10:30:14PM +0000, Kani, Toshi wrote:
> > Thanks for the clarification. After reading through SDM one more time, I
> > agree that we need a TLB purge here. Here is my current understanding.
> >
> > - INVLPG purges both TLB and paging-structure caches. So, PMD cache was
> > purged once.
> > - However, processor may cache this PMD entry later in speculation
> > since it has p-bit set. (This is where my misunderstanding was.
> > Speculation is not allowed to access a target address, but it may still
> > cache this PMD entry.)
> > - A single INVLPG on each processor purges this PMD cache. It does not
> > need a range purge (which was already done).
> >
> > Does it sound right to you?
>
> The right fix is to first synchronize the changes when the PMD/PUD is
> cleared and then flush the TLB system-wide. After that is done you can
> free the page.
Agreed. This can be done on top of this patch.
> But doing all that in the pud/pmd_free_pmd/pte_page() functions is too
> expensive, as the TLB flush requires to send IPIs to all cores in the
> system, and that every time the function is called.
>
> So what needs to be done is to fix this from high-level ioremap code to
> first unmap all required PTE/PMD pages and collect them in a list. When
> that is done you can synchronize the changes with the other page-tables
> in the system and do one system-wide TLB flush. When that is complete
> you can free the pages on the list that were collected while unmapping.
>
> Then the new mappings can be established and again synchronized with the
> other page-tables in the system.
Yes, and this patch was designed to work in such way. Please note that
this patch added pud_free_pmd_page() and pmd_free_pte_page() to the
ioremap() path when and only when it creates a pud or pmd map. This
assures the following preconditions are met without overhead.
- All pte entries for a target pud/pmd address range have been cleared.
- System-wide TLB purges have been done for a target pud/pmd address
range.
So, we can add the step 2 on top of this patch.
1. Clear pud/pmd entry.
2. System wide TLB flush <-- TO BE ADDED BY NEW PATCH
3. Free its underlining pmd/pte page.
> > As for the BUG_ON issue, are you able to reproduce this issue? If so,
> > would you be able to test the fix?
>
> Yes, I can reproduce the BUG_ON with my PTI patches and a fedora-i386
> VM.
Great!
> I already ran into the issue before your patches were merged upstream,
> but my "fix" is different because it just prevents huge-mappings when
> there were smaller mappings before. See
>
> e3e288121408 x86/pgtable: Don't set huge PUD/PMD on non-leaf entries
>
> for details. This patch does not fix the base-problem, but hides it
> again, as the real fix needs some more work across architectures.
Right. Patch 1/2 of this series made the same fix as well. See:
b6bdb7517c3d mm/vmalloc: add interfaces to free unmapped page table
> Your patch actually makes the problem worse, without it the PTE/PMD pages
> were just leaked, so that they could not be reused. But with your patch
> the pages can be used again and the page-walker might establish TLB
> entries based on random content the new owner writes to it. This can
> lead to all kinds of random and very hard to debug data corruption
> issues.
>
> So until we make the generic ioremap code in lib/ioremap.c smarter about
> unmapping/remapping ranges the best solution is making my fix work again
> by reverting your patch.
We do not need to revert this patch. We can make the above change I
mentioned.
Thanks,
-Toshi
next prev parent reply other threads:[~2018-04-27 14:31 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-03-14 18:01 [PATCH v2 0/2] fix memory leak / panic in ioremap huge pages Toshi Kani
2018-03-14 18:01 ` [PATCH v2 1/2] mm/vmalloc: Add interfaces to free unmapped page table Toshi Kani
2018-03-14 22:38 ` Andrew Morton
2018-03-15 14:27 ` Kani, Toshi
2018-03-14 18:01 ` [PATCH v2 2/2] x86/mm: implement free pmd/pte page interfaces Toshi Kani
2018-03-15 7:39 ` Chintan Pandya
2018-03-15 14:51 ` Kani, Toshi
2018-04-26 14:19 ` Joerg Roedel
2018-04-26 16:21 ` Kani, Toshi
2018-04-26 17:23 ` joro at 8bytes.org
2018-04-26 17:49 ` Kani, Toshi
2018-04-26 20:07 ` joro at 8bytes.org
2018-04-26 22:30 ` Kani, Toshi
2018-04-27 7:37 ` joro at 8bytes.org
2018-04-27 11:39 ` Michal Hocko
2018-04-27 11:46 ` joro at 8bytes.org
2018-04-27 11:52 ` Chintan Pandya
2018-04-27 12:48 ` joro at 8bytes.org
2018-04-27 13:42 ` Chintan Pandya
2018-04-27 14:31 ` Kani, Toshi [this message]
2018-04-28 9:02 ` joro at 8bytes.org
2018-04-28 20:54 ` Kani, Toshi
2018-04-30 7:30 ` Chintan Pandya
2018-04-30 13:43 ` Kani, Toshi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1524839460.2693.531.camel@hpe.com \
--to=toshi.kani@hpe.com \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).