* [PATCH v3 1/3] x86/mm: disable ioremap free page handling on x86-PAE
[not found] <20180516233207.1580-1-toshi.kani@hpe.com>
@ 2018-05-16 23:32 ` Toshi Kani
2018-05-16 23:32 ` [PATCH v3 2/3] ioremap: Update pgtable free interfaces with addr Toshi Kani
2018-05-16 23:32 ` [PATCH v3 3/3] x86/mm: add TLB purge to free pmd/pte page interfaces Toshi Kani
2 siblings, 0 replies; 9+ messages in thread
From: Toshi Kani @ 2018-05-16 23:32 UTC (permalink / raw)
To: mhocko, akpm, tglx, mingo, hpa
Cc: cpandya, linux-mm, x86, linux-arm-kernel, linux-kernel,
Toshi Kani, Joerg Roedel, stable
ioremap() supports pmd mappings on x86-PAE. However, the kernel's pmd
tables are not shared among processes on x86-PAE. Therefore, any
update to sync'd pmd entries needs re-syncing. Freeing a pte page
also leads to a vmalloc fault and hits the BUG_ON in vmalloc_sync_one().
Disable the free page handling on x86-PAE. pud_free_pmd_page() and
pmd_free_pte_page() simply return 0 if the given pud/pmd entry is present.
This assures that ioremap() does not update sync'd pmd entries, at the
cost of falling back to pte mappings.
Fixes: 28ee90fe6048 ("x86/mm: implement free pmd/pte page interfaces")
Reported-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: <stable@vger.kernel.org>
---
arch/x86/mm/pgtable.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index ffc8c13c50e4..3f7180bc5f52 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -715,6 +715,7 @@ int pmd_clear_huge(pmd_t *pmd)
return 0;
}
+#ifdef CONFIG_X86_64
/**
* pud_free_pmd_page - Clear pud entry and free pmd page.
* @pud: Pointer to a PUD.
@@ -762,4 +763,22 @@ int pmd_free_pte_page(pmd_t *pmd)
return 1;
}
+
+#else /* !CONFIG_X86_64 */
+
+int pud_free_pmd_page(pud_t *pud)
+{
+ return pud_none(*pud);
+}
+
+/*
+ * Disable free page handling on x86-PAE. This assures that ioremap()
+ * does not update sync'd pmd entries. See vmalloc_sync_one().
+ */
+int pmd_free_pte_page(pmd_t *pmd)
+{
+ return pmd_none(*pmd);
+}
+
+#endif /* CONFIG_X86_64 */
#endif /* CONFIG_HAVE_ARCH_HUGE_VMAP */
* [PATCH v3 2/3] ioremap: Update pgtable free interfaces with addr
[not found] <20180516233207.1580-1-toshi.kani@hpe.com>
2018-05-16 23:32 ` [PATCH v3 1/3] x86/mm: disable ioremap free page handling on x86-PAE Toshi Kani
@ 2018-05-16 23:32 ` Toshi Kani
2018-05-17 6:47 ` Michal Hocko
2018-05-16 23:32 ` [PATCH v3 3/3] x86/mm: add TLB purge to free pmd/pte page interfaces Toshi Kani
2 siblings, 1 reply; 9+ messages in thread
From: Toshi Kani @ 2018-05-16 23:32 UTC (permalink / raw)
To: mhocko, akpm, tglx, mingo, hpa
Cc: cpandya, linux-mm, x86, linux-arm-kernel, linux-kernel,
Toshi Kani, stable
From: Chintan Pandya <cpandya@codeaurora.org>
This patch ("mm/vmalloc: Add interfaces to free unmapped
page table") adds the following two interfaces to free the
page tables when we implement huge mappings:
pud_free_pmd_page() and pmd_free_pte_page()
Some architectures (like arm64) need to do proper TLB
maintenance after updating a page table entry, even on map.
For the reasoning, see:
https://patchwork.kernel.org/patch/10134581/
Pass 'addr' in these interfaces so that the proper TLB ops
can be performed.
[toshi@hpe.com: merge changes]
Fixes: 28ee90fe6048 ("x86/mm: implement free pmd/pte page interfaces")
Signed-off-by: Chintan Pandya <cpandya@codeaurora.org>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: <stable@vger.kernel.org>
---
arch/arm64/mm/mmu.c | 4 ++--
arch/x86/mm/pgtable.c | 12 +++++++-----
include/asm-generic/pgtable.h | 8 ++++----
lib/ioremap.c | 4 ++--
4 files changed, 15 insertions(+), 13 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 2dbb2c9f1ec1..da98828609a1 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -973,12 +973,12 @@ int pmd_clear_huge(pmd_t *pmdp)
return 1;
}
-int pud_free_pmd_page(pud_t *pud)
+int pud_free_pmd_page(pud_t *pud, unsigned long addr)
{
return pud_none(*pud);
}
-int pmd_free_pte_page(pmd_t *pmd)
+int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
{
return pmd_none(*pmd);
}
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 3f7180bc5f52..f60fdf411103 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -719,11 +719,12 @@ int pmd_clear_huge(pmd_t *pmd)
/**
* pud_free_pmd_page - Clear pud entry and free pmd page.
* @pud: Pointer to a PUD.
+ * @addr: Virtual address associated with pud.
*
* Context: The pud range has been unmaped and TLB purged.
* Return: 1 if clearing the entry succeeded. 0 otherwise.
*/
-int pud_free_pmd_page(pud_t *pud)
+int pud_free_pmd_page(pud_t *pud, unsigned long addr)
{
pmd_t *pmd;
int i;
@@ -734,7 +735,7 @@ int pud_free_pmd_page(pud_t *pud)
pmd = (pmd_t *)pud_page_vaddr(*pud);
for (i = 0; i < PTRS_PER_PMD; i++)
- if (!pmd_free_pte_page(&pmd[i]))
+ if (!pmd_free_pte_page(&pmd[i], addr + (i * PMD_SIZE)))
return 0;
pud_clear(pud);
@@ -746,11 +747,12 @@ int pud_free_pmd_page(pud_t *pud)
/**
* pmd_free_pte_page - Clear pmd entry and free pte page.
* @pmd: Pointer to a PMD.
+ * @addr: Virtual address associated with pmd.
*
* Context: The pmd range has been unmaped and TLB purged.
* Return: 1 if clearing the entry succeeded. 0 otherwise.
*/
-int pmd_free_pte_page(pmd_t *pmd)
+int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
{
pte_t *pte;
@@ -766,7 +768,7 @@ int pmd_free_pte_page(pmd_t *pmd)
#else /* !CONFIG_X86_64 */
-int pud_free_pmd_page(pud_t *pud)
+int pud_free_pmd_page(pud_t *pud, unsigned long addr)
{
return pud_none(*pud);
}
@@ -775,7 +777,7 @@ int pud_free_pmd_page(pud_t *pud)
* Disable free page handling on x86-PAE. This assures that ioremap()
* does not update sync'd pmd entries. See vmalloc_sync_one().
*/
-int pmd_free_pte_page(pmd_t *pmd)
+int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
{
return pmd_none(*pmd);
}
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index f59639afaa39..b081794ba135 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -1019,8 +1019,8 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot);
int pmd_set_huge(pmd_t *pmd, phys_addr_t addr, pgprot_t prot);
int pud_clear_huge(pud_t *pud);
int pmd_clear_huge(pmd_t *pmd);
-int pud_free_pmd_page(pud_t *pud);
-int pmd_free_pte_page(pmd_t *pmd);
+int pud_free_pmd_page(pud_t *pud, unsigned long addr);
+int pmd_free_pte_page(pmd_t *pmd, unsigned long addr);
#else /* !CONFIG_HAVE_ARCH_HUGE_VMAP */
static inline int p4d_set_huge(p4d_t *p4d, phys_addr_t addr, pgprot_t prot)
{
@@ -1046,11 +1046,11 @@ static inline int pmd_clear_huge(pmd_t *pmd)
{
return 0;
}
-static inline int pud_free_pmd_page(pud_t *pud)
+static inline int pud_free_pmd_page(pud_t *pud, unsigned long addr)
{
return 0;
}
-static inline int pmd_free_pte_page(pmd_t *pmd)
+static inline int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
{
return 0;
}
diff --git a/lib/ioremap.c b/lib/ioremap.c
index 54e5bbaa3200..517f5853ffed 100644
--- a/lib/ioremap.c
+++ b/lib/ioremap.c
@@ -92,7 +92,7 @@ static inline int ioremap_pmd_range(pud_t *pud, unsigned long addr,
if (ioremap_pmd_enabled() &&
((next - addr) == PMD_SIZE) &&
IS_ALIGNED(phys_addr + addr, PMD_SIZE) &&
- pmd_free_pte_page(pmd)) {
+ pmd_free_pte_page(pmd, addr)) {
if (pmd_set_huge(pmd, phys_addr + addr, prot))
continue;
}
@@ -119,7 +119,7 @@ static inline int ioremap_pud_range(p4d_t *p4d, unsigned long addr,
if (ioremap_pud_enabled() &&
((next - addr) == PUD_SIZE) &&
IS_ALIGNED(phys_addr + addr, PUD_SIZE) &&
- pud_free_pmd_page(pud)) {
+ pud_free_pmd_page(pud, addr)) {
if (pud_set_huge(pud, phys_addr + addr, prot))
continue;
}
* [PATCH v3 3/3] x86/mm: add TLB purge to free pmd/pte page interfaces
[not found] <20180516233207.1580-1-toshi.kani@hpe.com>
2018-05-16 23:32 ` [PATCH v3 1/3] x86/mm: disable ioremap free page handling on x86-PAE Toshi Kani
2018-05-16 23:32 ` [PATCH v3 2/3] ioremap: Update pgtable free interfaces with addr Toshi Kani
@ 2018-05-16 23:32 ` Toshi Kani
2018-05-29 14:44 ` Joerg Roedel
2 siblings, 1 reply; 9+ messages in thread
From: Toshi Kani @ 2018-05-16 23:32 UTC (permalink / raw)
To: mhocko, akpm, tglx, mingo, hpa
Cc: cpandya, linux-mm, x86, linux-arm-kernel, linux-kernel,
Toshi Kani, Joerg Roedel, stable
ioremap() calls pud_free_pmd_page() / pmd_free_pte_page() when it creates
a pud / pmd map. The following preconditions are met at their entry:
- All pte entries for the target pud/pmd address range have been cleared.
- System-wide TLB purges have been performed for the target pud/pmd
address range.
These preconditions assure that there is no stale TLB entry for the range.
Speculation may not cache TLB entries, since that requires all levels of
page entries, including ptes, to have the P and A bits set for the
associated address. However, speculation may cache pud/pmd entries
(paging-structure caches) when they have the P bit set.
Add a system-wide TLB purge (INVLPG) to a single page after clearing
the pud/pmd entry's P bit.
SDM 4.10.4.1, Operations that Invalidate TLBs and Paging-Structure Caches,
states that:
INVLPG invalidates all paging-structure caches associated with the
current PCID regardless of the linear addresses to which they correspond.
Fixes: 28ee90fe6048 ("x86/mm: implement free pmd/pte page interfaces")
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: <stable@vger.kernel.org>
---
arch/x86/mm/pgtable.c | 34 ++++++++++++++++++++++++++++------
1 file changed, 28 insertions(+), 6 deletions(-)
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index f60fdf411103..7e96594c7e97 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -721,24 +721,42 @@ int pmd_clear_huge(pmd_t *pmd)
* @pud: Pointer to a PUD.
* @addr: Virtual address associated with pud.
*
- * Context: The pud range has been unmaped and TLB purged.
+ * Context: The pud range has been unmapped and TLB purged.
* Return: 1 if clearing the entry succeeded. 0 otherwise.
*/
int pud_free_pmd_page(pud_t *pud, unsigned long addr)
{
- pmd_t *pmd;
+ pmd_t *pmd, *pmd_sv;
+ pte_t *pte;
int i;
if (pud_none(*pud))
return 1;
pmd = (pmd_t *)pud_page_vaddr(*pud);
+ pmd_sv = (pmd_t *)__get_free_page(GFP_KERNEL);
+ if (!pmd_sv)
+ return 0;
- for (i = 0; i < PTRS_PER_PMD; i++)
- if (!pmd_free_pte_page(&pmd[i], addr + (i * PMD_SIZE)))
- return 0;
+ for (i = 0; i < PTRS_PER_PMD; i++) {
+ pmd_sv[i] = pmd[i];
+ if (!pmd_none(pmd[i]))
+ pmd_clear(&pmd[i]);
+ }
pud_clear(pud);
+
+ /* INVLPG to clear all paging-structure caches */
+ flush_tlb_kernel_range(addr, addr + PAGE_SIZE-1);
+
+ for (i = 0; i < PTRS_PER_PMD; i++) {
+ if (!pmd_none(pmd_sv[i])) {
+ pte = (pte_t *)pmd_page_vaddr(pmd_sv[i]);
+ free_page((unsigned long)pte);
+ }
+ }
+
+ free_page((unsigned long)pmd_sv);
free_page((unsigned long)pmd);
return 1;
@@ -749,7 +767,7 @@ int pud_free_pmd_page(pud_t *pud, unsigned long addr)
* @pmd: Pointer to a PMD.
* @addr: Virtual address associated with pmd.
*
- * Context: The pmd range has been unmaped and TLB purged.
+ * Context: The pmd range has been unmapped and TLB purged.
* Return: 1 if clearing the entry succeeded. 0 otherwise.
*/
int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
@@ -761,6 +779,10 @@ int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
pte = (pte_t *)pmd_page_vaddr(*pmd);
pmd_clear(pmd);
+
+ /* INVLPG to clear all paging-structure caches */
+ flush_tlb_kernel_range(addr, addr + PAGE_SIZE-1);
+
free_page((unsigned long)pte);
return 1;
* Re: [PATCH v3 2/3] ioremap: Update pgtable free interfaces with addr
2018-05-16 23:32 ` [PATCH v3 2/3] ioremap: Update pgtable free interfaces with addr Toshi Kani
@ 2018-05-17 6:47 ` Michal Hocko
2018-05-17 14:32 ` Kani, Toshi
0 siblings, 1 reply; 9+ messages in thread
From: Michal Hocko @ 2018-05-17 6:47 UTC (permalink / raw)
To: Toshi Kani
Cc: akpm, tglx, mingo, hpa, cpandya, linux-mm, x86, linux-arm-kernel,
linux-kernel, stable
On Wed 16-05-18 17:32:06, Kani Toshimitsu wrote:
> From: Chintan Pandya <cpandya@codeaurora.org>
>
> This patch ("mm/vmalloc: Add interfaces to free unmapped
> page table") adds the following two interfaces to free the
> page tables when we implement huge mappings:
>
> pud_free_pmd_page() and pmd_free_pte_page()
>
> Some architectures (like arm64) need to do proper TLB
> maintenance after updating a page table entry, even on map.
> For the reasoning, see:
> https://patchwork.kernel.org/patch/10134581/
Please add that information to the changelog.
--
Michal Hocko
SUSE Labs
* Re: [PATCH v3 2/3] ioremap: Update pgtable free interfaces with addr
2018-05-17 6:47 ` Michal Hocko
@ 2018-05-17 14:32 ` Kani, Toshi
0 siblings, 0 replies; 9+ messages in thread
From: Kani, Toshi @ 2018-05-17 14:32 UTC (permalink / raw)
To: mhocko@kernel.org
Cc: linux-kernel@vger.kernel.org, tglx@linutronix.de,
linux-mm@kvack.org, stable@vger.kernel.org, x86@kernel.org,
akpm@linux-foundation.org, hpa@zytor.com, mingo@redhat.com,
linux-arm-kernel@lists.infradead.org, cpandya@codeaurora.org
On Thu, 2018-05-17 at 08:47 +0200, Michal Hocko wrote:
> On Wed 16-05-18 17:32:06, Kani Toshimitsu wrote:
> > From: Chintan Pandya <cpandya@codeaurora.org>
> >
> > This patch ("mm/vmalloc: Add interfaces to free unmapped
> > page table") adds the following two interfaces to free the
> > page tables when we implement huge mappings:
> >
> > pud_free_pmd_page() and pmd_free_pte_page()
> >
> > Some architectures (like arm64) need to do proper TLB
> > maintenance after updating a page table entry, even on map.
> > For the reasoning, see:
> > https://patchwork.kernel.org/patch/10134581/
>
> Please add that information to the changelog.
I will update the description and resend this patch.
Thanks!
-Toshi
* Re: [PATCH v3 3/3] x86/mm: add TLB purge to free pmd/pte page interfaces
2018-05-16 23:32 ` [PATCH v3 3/3] x86/mm: add TLB purge to free pmd/pte page interfaces Toshi Kani
@ 2018-05-29 14:44 ` Joerg Roedel
2018-05-29 16:10 ` Kani, Toshi
0 siblings, 1 reply; 9+ messages in thread
From: Joerg Roedel @ 2018-05-29 14:44 UTC (permalink / raw)
To: mingo, tglx, Toshi Kani
Cc: mhocko, akpm, tglx, mingo, hpa, cpandya, linux-mm, x86,
linux-arm-kernel, linux-kernel, stable
On Wed, May 16, 2018 at 05:32:07PM -0600, Toshi Kani wrote:
> pmd = (pmd_t *)pud_page_vaddr(*pud);
> + pmd_sv = (pmd_t *)__get_free_page(GFP_KERNEL);
> + if (!pmd_sv)
> + return 0;
So your code still needs to allocate a full page where a simple
list_head on the stack would do the same job.
Ingo, Thomas, can you please just revert the original broken patch for
now, until there is a proper fix?
Thanks,
Joerg
* Re: [PATCH v3 3/3] x86/mm: add TLB purge to free pmd/pte page interfaces
2018-05-29 14:44 ` Joerg Roedel
@ 2018-05-29 16:10 ` Kani, Toshi
2018-05-30 4:59 ` joro
0 siblings, 1 reply; 9+ messages in thread
From: Kani, Toshi @ 2018-05-29 16:10 UTC (permalink / raw)
To: tglx@linutronix.de, joro@8bytes.org, mingo@redhat.com
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
stable@vger.kernel.org, x86@kernel.org, akpm@linux-foundation.org,
hpa@zytor.com, linux-arm-kernel@lists.infradead.org,
cpandya@codeaurora.org, Hocko, Michal
On Tue, 2018-05-29 at 16:44 +0200, Joerg Roedel wrote:
> On Wed, May 16, 2018 at 05:32:07PM -0600, Toshi Kani wrote:
> > pmd = (pmd_t *)pud_page_vaddr(*pud);
> > + pmd_sv = (pmd_t *)__get_free_page(GFP_KERNEL);
> > + if (!pmd_sv)
> > + return 0;
>
> So your code still needs to allocate a full page where a simple
> list_head on the stack would do the same job.
Can you explain why you think allocating a page here is a major problem?
As I explained before, pud_free_pmd_page() covers an extremely rare case
which I could not even hit with a huge number of ioremap() calls until
I instrumented alloc_vmap_area() to force this case to happen. I do not
think pages should be listed for such a rare case.
> Ingo, Thomas, can you please just revert the original broken patch for
> now until there is proper fix?
If we just revert, please apply patch 1/3 first. That patch addresses the
BUG_ON issue on PAE. This is a real issue that needs a fix ASAP.
The page-directory cache issue on x64, which is addressed by patch 3/3,
is a theoretical issue that I could not hit by putting ioremap() calls
into a loop for a whole day. Nobody hit this issue, either.
The simple revert patch Joerg posted a while ago causes
pmd_free_pte_page() to fail on x64. This causes multiple pmd mappings
to fall back to pte mappings on my test systems. This can be seen as a
degradation, and I am afraid that it is more harmful than good.
Thanks,
-Toshi
* Re: [PATCH v3 3/3] x86/mm: add TLB purge to free pmd/pte page interfaces
2018-05-29 16:10 ` Kani, Toshi
@ 2018-05-30 4:59 ` joro
2018-05-30 15:39 ` Kani, Toshi
0 siblings, 1 reply; 9+ messages in thread
From: joro @ 2018-05-30 4:59 UTC (permalink / raw)
To: Kani, Toshi
Cc: tglx@linutronix.de, mingo@redhat.com,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
stable@vger.kernel.org, x86@kernel.org, akpm@linux-foundation.org,
hpa@zytor.com, linux-arm-kernel@lists.infradead.org,
cpandya@codeaurora.org, Hocko, Michal
On Tue, May 29, 2018 at 04:10:24PM +0000, Kani, Toshi wrote:
> Can you explain why you think allocating a page here is a major problem?
Because a larger allocation is more likely to fail. And if you fail the
allocation, you also fail to free more pages, which _is_ a problem. So
better avoid any allocations in code paths that are about freeing
memory.
> If we just revert, please apply patch 1/3 first. This patch address the
> BUG_ON issue on PAE. This is a real issue that needs a fix ASAP.
It does not address the problem of dirty page-walk caches on x86-64.
> The page-directory cache issue on x64, which is addressed by patch 3/3,
> is a theoretical issue that I could not hit by putting ioremap() calls
> into a loop for a whole day. Nobody hit this issue, either.
How do you know you didn't hit that issue? It might cause silent data
corruption, which might not be easily detected.
> The simple revert patch Joerg posted a while ago causes
> pmd_free_pte_page() to fail on x64. This causes multiple pmd mappings
> to fall into pte mappings on my test systems. This can be seen as a
> degradation, and I am afraid that it is more harmful than good.
The plain revert just removes all the issues with the dirty TLB that the
original patch introduced, and prevents huge mappings from being
established when there have been smaller mappings before. This is not
ideal, but at least it is consistent, does not leak pages, and leaves
no dirty TLBs. So this is the easiest and most reliable fix at this
stage in the release process.
Regards,
Joerg
* Re: [PATCH v3 3/3] x86/mm: add TLB purge to free pmd/pte page interfaces
2018-05-30 4:59 ` joro
@ 2018-05-30 15:39 ` Kani, Toshi
0 siblings, 0 replies; 9+ messages in thread
From: Kani, Toshi @ 2018-05-30 15:39 UTC (permalink / raw)
To: joro@8bytes.org
Cc: linux-kernel@vger.kernel.org, tglx@linutronix.de,
linux-mm@kvack.org, stable@vger.kernel.org, x86@kernel.org,
akpm@linux-foundation.org, hpa@zytor.com, mingo@redhat.com,
linux-arm-kernel@lists.infradead.org, cpandya@codeaurora.org,
Hocko, Michal
On Wed, 2018-05-30 at 06:59 +0200, joro@8bytes.org wrote:
> On Tue, May 29, 2018 at 04:10:24PM +0000, Kani, Toshi wrote:
> > Can you explain why you think allocating a page here is a major problem?
>
> Because a larger allocation is more likely to fail. And if you fail the
> allocation, you also fail to free more pages, which _is_ a problem. So
> better avoid any allocations in code paths that are about freeing
> memory.
It only allocates a single page. If it fails to allocate a single page,
this pud mapping request simply falls back to pmd mappings, which is only
as bad as what your suggested plain revert always does for both pud and
pmd mappings. Also, this function is called from ioremap() when a driver
initializes its mappings. If the system does not have a single page to
allocate, the driver has a much bigger issue to deal with than its
request falling back to pmd mappings. Please also note that this function
is not called from a free-memory path.
> > If we just revert, please apply patch 1/3 first. This patch address the
> > BUG_ON issue on PAE. This is a real issue that needs a fix ASAP.
>
> It does not address the problem of dirty page-walk caches on x86-64.
Patch 3/3 fixes it. I hope my explanation above clarifies this.
> > The page-directory cache issue on x64, which is addressed by patch 3/3,
> > is a theoretical issue that I could not hit by putting ioremap() calls
> > into a loop for a whole day. Nobody hit this issue, either.
>
> How do you know you didn't hit that issue? It might cause silent data
> corruption, which might not be easily detected.
If the test had hit that issue, it would have caused a kernel page fault
(the freed & cleared pte page still unmapped, or the page reused with
invalid attribute data) or an MCE (the pte page reused with an invalid
physical address) when it accessed the ioremap'd address.
Causing silent data corruption requires that this freed & cleared pte
page be reused and re-initialized with valid pte entry data. While
this case is possible, the chance of my test hitting only this case,
without ever hitting the much more likely cases of a page fault or MCE,
is basically zero.
> > The simple revert patch Joerg posted a while ago causes
> > pmd_free_pte_page() to fail on x64. This causes multiple pmd mappings
> > to fall into pte mappings on my test systems. This can be seen as a
> > degradation, and I am afraid that it is more harmful than good.
>
> The plain revert just removes all the issues with the dirty TLB that the
> original patch introduced and prevents huge mappings from being
> established when there have been smaller mappings before. This is not
> ideal, but at least its is consistent and does not leak pages and leaves
> no dirty TLBs. So this is the easiest and most reliable fix for this
> stage in the release process.
If you think the page directory issue needs a fix ASAP, I believe taking
patch 3/3 is much better option than the plain revert, which will
introduce the fall-back issue I explained.
Thanks,
-Toshi