From: Borislav Petkov <bp@alien8.de>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
hpa@zytor.com, tglx@linutronix.de, mingo@redhat.com,
akpm@linux-foundation.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, x86@kernel.org, jgross@suse.com,
konrad.wilk@oracle.com, elliott@hpe.com,
boris.ostrovsky@oracle.com, Toshi Kani <toshi.kani@hpe.com>
Subject: Re: [PATCH] x86/mm: fix regression with huge pages on PAE
Date: Wed, 11 Nov 2015 10:51:01 +0100
Message-ID: <20151111095101.GA22512@pd.tnic>
In-Reply-To: <20151110170447.GH19187@pd.tnic>
On Tue, Nov 10, 2015 at 06:04:47PM +0100, Borislav Petkov wrote:
> On Tue, Nov 10, 2015 at 05:07:13PM +0200, Kirill A. Shutemov wrote:
> > Yeah. Looks good to me.
>
> It gets even cleaner. Let me run it through the *config build tests.
Hohumm, it passes. Here's what I ended up committing:
---
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Date: Tue, 10 Nov 2015 01:18:10 +0200
Subject: [PATCH] x86/mm: Fix regression with huge pages on PAE
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
A recent PAT patchset has caused an issue on 32-bit PAE machines:
page:eea45000 count:0 mapcount:-128 mapping: (null) index:0x0
flags: 0x40000000()
page dumped because: VM_BUG_ON_PAGE(page_mapcount(page) < 0)
------------[ cut here ]------------
kernel BUG at /home/build/linux-boris/mm/huge_memory.c:1485!
invalid opcode: 0000 [#1] SMP
Modules linked in: ahci libahci ata_generic skge r8169 firewire_ohci mii libata qla2xxx(+) scsi_transport_fc scsi_mod radeon tpm_infineon ttm backlight wmi acpi_cpufreq tpm_tis
CPU: 2 PID: 1758 Comm: modprobe Not tainted 4.3.0upstream-09269-gce5c2d2 #1
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS 080014 07/18/2008
task: ed84e600 ti: f6458000 task.ti: f6458000
EIP: 0060:[<c11bde80>] EFLAGS: 00010246 CPU: 2
EIP is at zap_huge_pmd+0x240/0x260
EAX: 00000000 EBX: f6459eb0 ECX: 00000292 EDX: 00000292
ESI: f6634d98 EDI: eea45000 EBP: f6459dc8 ESP: f6459d98
ata1: SATA link down (SStatus 0 SControl 300)
ata2: SATA link down (SStatus 0 SControl 300)
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 8005003b CR2: b75b21a0 CR3: 3655b880 CR4: 000006f0
Stack:
...
Call Trace:
unmap_single_vma
? __wake_up
unmap_vmas
unmap_region
do_munmap
vm_munmap
SyS_munmap
do_fast_syscall_32
? __do_page_fault
sysenter_past_esp
Code: 00 e9 05 fe ff ff 90 8d 74 26 00 0f 0b eb fe ba 4c e1 7a c1 89 f8 e8 f0 91 fd ff 0f 0b eb fe ba 6c e1 7a c1 89 f8 e8 e0 91 fd ff <0f> 0b eb fe ba c4 e1 7a c1 89 f8 e8 d0 91 fd ff 0f 0b eb fe 8d
EIP: [<c11bde80>] zap_huge_pmd+0x240/0x260 SS:ESP 0068:f6459d98
---[ end trace cba8fb1fc2e2e78a ]---
The problem is in pmd_pfn_mask() and pmd_flags_mask(). These helpers use
PMD_PAGE_MASK to calculate the resulting mask. PMD_PAGE_MASK is 'unsigned
long', not 'unsigned long long' like phys_addr_t. As a result, the upper
bits of the resulting mask are truncated.
The patch reworks the code to use PMD_SHIFT as the base of the mask
calculation instead of PMD_PAGE_MASK.
pud_pfn_mask() and pud_flags_mask() aren't problematic since we don't
have a PUD page table level on 32-bit systems, but they are reworked too
to be consistent with their PMD counterparts.
Reported-and-Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: elliott@hpe.com
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jürgen Gross <jgross@suse.com>
Cc: konrad.wilk@oracle.com
Cc: linux-mm <linux-mm@kvack.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: f70abb0fc3da ("x86/asm: Fix pud/pmd interfaces to handle large PAT bit")
Link: http://lkml.kernel.org/r/1447111090-8526-1-git-send-email-kirill.shutemov@linux.intel.com
[ Fix -Woverflow warnings from the realmode code. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
---
arch/x86/boot/boot.h | 1 -
arch/x86/boot/video-mode.c | 2 ++
arch/x86/boot/video.c | 2 ++
arch/x86/include/asm/pgtable_types.h | 14 ++++----------
arch/x86/include/asm/x86_init.h | 1 -
5 files changed, 8 insertions(+), 12 deletions(-)
diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
index 0033e96c3f09..9011a88353de 100644
--- a/arch/x86/boot/boot.h
+++ b/arch/x86/boot/boot.h
@@ -23,7 +23,6 @@
#include <stdarg.h>
#include <linux/types.h>
#include <linux/edd.h>
-#include <asm/boot.h>
#include <asm/setup.h>
#include "bitops.h"
#include "ctype.h"
diff --git a/arch/x86/boot/video-mode.c b/arch/x86/boot/video-mode.c
index aa8a96b052e3..95c7a818c0ed 100644
--- a/arch/x86/boot/video-mode.c
+++ b/arch/x86/boot/video-mode.c
@@ -19,6 +19,8 @@
#include "video.h"
#include "vesa.h"
+#include <uapi/asm/boot.h>
+
/*
* Common variables
*/
diff --git a/arch/x86/boot/video.c b/arch/x86/boot/video.c
index 05111bb8d018..77780e386e9b 100644
--- a/arch/x86/boot/video.c
+++ b/arch/x86/boot/video.c
@@ -13,6 +13,8 @@
* Select video mode
*/
+#include <uapi/asm/boot.h>
+
#include "boot.h"
#include "video.h"
#include "vesa.h"
diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index dd5b0aa9dd2f..c1e797266ce9 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -279,17 +279,14 @@ static inline pmdval_t native_pmd_val(pmd_t pmd)
static inline pudval_t pud_pfn_mask(pud_t pud)
{
if (native_pud_val(pud) & _PAGE_PSE)
- return PUD_PAGE_MASK & PHYSICAL_PAGE_MASK;
+ return ~((1ULL << PUD_SHIFT) - 1) & PHYSICAL_PAGE_MASK;
else
return PTE_PFN_MASK;
}
static inline pudval_t pud_flags_mask(pud_t pud)
{
- if (native_pud_val(pud) & _PAGE_PSE)
- return ~(PUD_PAGE_MASK & (pudval_t)PHYSICAL_PAGE_MASK);
- else
- return ~PTE_PFN_MASK;
+ return ~pud_pfn_mask(pud);
}
static inline pudval_t pud_flags(pud_t pud)
@@ -300,17 +297,14 @@ static inline pudval_t pud_flags(pud_t pud)
static inline pmdval_t pmd_pfn_mask(pmd_t pmd)
{
if (native_pmd_val(pmd) & _PAGE_PSE)
- return PMD_PAGE_MASK & PHYSICAL_PAGE_MASK;
+ return ~((1ULL << PMD_SHIFT) - 1) & PHYSICAL_PAGE_MASK;
else
return PTE_PFN_MASK;
}
static inline pmdval_t pmd_flags_mask(pmd_t pmd)
{
- if (native_pmd_val(pmd) & _PAGE_PSE)
- return ~(PMD_PAGE_MASK & (pmdval_t)PHYSICAL_PAGE_MASK);
- else
- return ~PTE_PFN_MASK;
+ return ~pmd_pfn_mask(pmd);
}
static inline pmdval_t pmd_flags(pmd_t pmd)
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index 48d34d28f5a6..cd0fc0cc78bc 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -1,7 +1,6 @@
#ifndef _ASM_X86_PLATFORM_H
#define _ASM_X86_PLATFORM_H
-#include <asm/pgtable_types.h>
#include <asm/bootparam.h>
struct mpc_bus;
--
2.3.5
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .