[PATCH] x86: mm: Check if PUD is large when validating a kernel address

* [PATCH] x86: mm: Check if PUD is large when validating a kernel address
@ 2013-02-11 14:52 ` Mel Gorman
  0 siblings, 0 replies; 23+ messages in thread
From: Mel Gorman @ 2013-02-11 14:52 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton; +Cc: Mel Gorman, linux-kernel, linux-mm

A user reported the following oops when a backup process read
/proc/kcore.

 BUG: unable to handle kernel paging request at ffffbb00ff33b000
 IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
 PGD 0
 Oops: 0000 [#1] SMP
 CPU 6
 Modules linked in: af_packet nfs lockd fscache auth_rpcgss nfs_acl sunrpc 8021q garp stp llc cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf microcode fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod ioatdma ipv6 ipv6_lib igb dca i7core_edac edac_core i2c_i801 i2c_core cdc_ether usbnet bnx2 mii iTCO_wdt iTCO_vendor_support shpchp rtc_cmos pci_hotplug tpm_tis sg tpm pcspkr tpm_bios serio_raw button ext3 jbd mbcache uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif usb_common processor thermal_sys hwmon scsi_dh_emc scsi_dh_rdac scsi_dh_alua scsi_dh_hp_sw scsi_dh ata_generic ata_piix libata megaraid_sas scsi_mod

 Pid: 16196, comm: Hibackp Not tainted 3.0.13-0.27-default #1 IBM System x3550 M3 -[7944 K3G]-/94Y7614
 RIP: 0010:[<ffffffff8103157e>]  [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
 RSP: 0018:ffff88094165fe80  EFLAGS: 00010246
 RAX: 00003300ff33b000 RBX: ffff880100000000 RCX: 0000000000000000
 RDX: 0000000100000000 RSI: ffff880000000000 RDI: ff32b300ff33b400
 RBP: 0000000000001000 R08: 00003ffffffff000 R09: 0000000000000000
 R10: 22302e31223d6e6f R11: 0000000000000246 R12: 0000000000001000
 R13: 0000000000003000 R14: 0000000000571be0 R15: ffff88094165ff50
 FS:  00007ff152d33700(0000) GS:ffff88097f2c0000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: ffffbb00ff33b000 CR3: 00000009405a3000 CR4: 00000000000006e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process Hibackp (pid: 16196, threadinfo ffff88094165e000, task ffff8808eb9ba600)
 Stack:
  ffffffff811b8aaa 0000000000004000 ffff880943fea480 ffff8808ef2bae50
  ffff880943d32980 fffffffffffffffb ffff8808ef2bae40 ffff88094165ff50
  0000000000004000 000000000056ebe0 ffffffff811ad847 000000000056ebe0
 Call Trace:
  [<ffffffff811b8aaa>] read_kcore+0x17a/0x370
  [<ffffffff811ad847>] proc_reg_read+0x77/0xc0
  [<ffffffff81151687>] vfs_read+0xc7/0x130
  [<ffffffff811517f3>] sys_read+0x53/0xa0
  [<ffffffff81449692>] system_call_fastpath+0x16/0x1b

Investigation determined that the bug triggered when reading system RAM
at the 4G mark. On this system, that was the first address using 1G pages
for the virt->phys direct mapping so the PUD is pointing to a physical
address, not a PMD page.  The problem is that the page table walker in
kern_addr_valid() is not checking pud_large() and treats the physical
address as if it was a PMD.  If it happens to look like pmd_none then it'll
silently fail, probably returning zeros instead of real data. If the data
happens to look like a present PMD though, it will be walked resulting in
the oops above. This patch adds the necessary pud_large() check.
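
As a cross-check, the register values in the oops appear consistent with
this explanation. The sketch below is a user-space illustration only, not
kernel code. The MODEL_* constants and model_next_table_vaddr() are
hypothetical names, the constant values are assumed to match a 3.0-era
x86-64 kernel with 46 physical address bits, and reading RDI as the bogus
"PMD" value, R08 as PTE_PFN_MASK and RSI as PAGE_OFFSET is an inference
from the numbers rather than something confirmed by disassembly.

```c
#include <stdint.h>

/* Assumed x86-64 values for a 3.0-era kernel; illustrative, not authoritative. */
#define MODEL_PAGE_OFFSET  0xffff880000000000ULL /* base of the direct mapping (RSI in the oops) */
#define MODEL_PTE_PFN_MASK 0x00003ffffffff000ULL /* physical-address bits of an entry (R08 in the oops) */

/*
 * Hypothetical helper: the virtual address at which a walker would look
 * for the next-level table if it trusted 'entry' to point at one.
 * It keeps only the physical-address bits and adds the direct-map base.
 */
uint64_t model_next_table_vaddr(uint64_t entry)
{
	return MODEL_PAGE_OFFSET + (entry & MODEL_PTE_PFN_MASK);
}
```

Under those assumptions, feeding the helper the RDI value
ff32b300ff33b400 first masks it to 00003300ff33b000 (RAX) and then yields
ffffbb00ff33b000, which is the faulting address in CR2. Feeding it a
large PUD entry for physical 4G yields ffff880100000000 (RBX), which would
be the start of the 1G page that was misread as a PMD table.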

Unfortunately, the problem was not readily reproducible, and the user is now
running the backup program without accessing /proc/kcore, so the patch has
not been validated, but I think it makes sense. If reviewers agree, then it
should also be included in -stable back as far as 3.0-stable.

Cc: stable@vger.kernel.org
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 arch/x86/include/asm/pgtable.h |    5 +++++
 arch/x86/mm/init_64.c          |    3 +++
 2 files changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 5199db2..1c1a955 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -142,6 +142,11 @@ static inline unsigned long pmd_pfn(pmd_t pmd)
 	return (pmd_val(pmd) & PTE_PFN_MASK) >> PAGE_SHIFT;
 }
 
+static inline unsigned long pud_pfn(pud_t pud)
+{
+	return (pud_val(pud) & PTE_PFN_MASK) >> PAGE_SHIFT;
+}
+
 #define pte_page(pte)	pfn_to_page(pte_pfn(pte))
 
 static inline int pmd_large(pmd_t pte)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 2ead3c8..75c9a6a 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -831,6 +831,9 @@ int kern_addr_valid(unsigned long addr)
 	if (pud_none(*pud))
 		return 0;
 
+	if (pud_large(*pud))
+		return pfn_valid(pud_pfn(*pud));
+
 	pmd = pmd_offset(pud, addr);
 	if (pmd_none(*pmd))
 		return 0;
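
For readers without the kernel headers at hand, the two PUD helpers used
by the hunk above can be modelled in user space roughly as follows. This is
a minimal sketch, not the kernel implementation: the model_* and MODEL_*
names are hypothetical, and the bit positions (present = bit 0, PSE = bit 7)
and the PFN mask are assumed to match x86-64 with 4K base pages and 46
physical address bits.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT   12
#define MODEL_PAGE_PRESENT (1ULL << 0)           /* entry is valid */
#define MODEL_PAGE_PSE     (1ULL << 7)           /* entry maps a large page directly */
#define MODEL_PTE_PFN_MASK 0x00003ffffffff000ULL /* physical-address bits of an entry */

/* Models pud_large(): true only when the entry is both present and PSE-flagged. */
bool model_pud_large(uint64_t pud)
{
	const uint64_t both = MODEL_PAGE_PSE | MODEL_PAGE_PRESENT;

	return (pud & both) == both;
}

/* Models the pud_pfn() added by the patch: strip the flag bits, shift to a frame number. */
uint64_t model_pud_pfn(uint64_t pud)
{
	return (pud & MODEL_PTE_PFN_MASK) >> MODEL_PAGE_SHIFT;
}
```

With these definitions, an entry such as 0x100000083 (present, writable and
PSE-flagged, pointing at physical 4G) is reported as large with PFN
0x100000, so a walker that checks model_pud_large() first stops at the PUD
level and never dereferences the 1G page as if it were a PMD table.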

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* Re: [PATCH] x86: mm: Check if PUD is large when validating a kernel address
  2013-02-11 14:52 ` Mel Gorman
@ 2013-02-11 19:41   ` Rik van Riel
  -1 siblings, 0 replies; 23+ messages in thread
From: Rik van Riel @ 2013-02-11 19:41 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Ingo Molnar, Andrew Morton, linux-kernel, linux-mm

On 02/11/2013 09:52 AM, Mel Gorman wrote:
> A user reported the following oops when a backup process read
> /proc/kcore.
>
>   BUG: unable to handle kernel paging request at ffffbb00ff33b000

> Investigation determined that the bug triggered when reading system RAM
> at the 4G mark. On this system, that was the first address using 1G pages
> for the virt->phys direct mapping so the PUD is pointing to a physical
> address, not a PMD page.  The problem is that the page table walker in
> kern_addr_valid() is not checking pud_large() and treats the physical
> address as if it was a PMD.  If it happens to look like pmd_none then it'll
> silently fail, probably returning zeros instead of real data. If the data
> happens to look like a present PMD though, it will be walked resulting in
> the oops above. This patch adds the necessary pud_large() check.
>
> Unfortunately the problem was not readily reproducible and now they are
> running the backup program without accessing /proc/kcore so the patch has
> not been validated but I think it makes sense. If reviewers agree then it
> should also be included in -stable back as far as 3.0-stable.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Mel Gorman <mgorman@suse.de>

Reviewed-by: Rik van Riel <riel@redhat.com>

* Re: [PATCH] x86: mm: Check if PUD is large when validating a kernel address
  2013-02-11 14:52 ` Mel Gorman
@ 2013-02-12  6:40   ` Johannes Weiner
  -1 siblings, 0 replies; 23+ messages in thread
From: Johannes Weiner @ 2013-02-12  6:40 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Ingo Molnar, Andrew Morton, linux-kernel, linux-mm

On Mon, Feb 11, 2013 at 02:52:36PM +0000, Mel Gorman wrote:
> A user reported the following oops when a backup process read
> /proc/kcore.
> 
>  BUG: unable to handle kernel paging request at ffffbb00ff33b000
>  IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
>  PGD 0
>  Oops: 0000 [#1] SMP
>  CPU 6
>  Modules linked in: af_packet nfs lockd fscache auth_rpcgss nfs_acl sunrpc 8021q garp stp llc cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf microcode fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod ioatdma ipv6 ipv6_lib igb dca i7core_edac edac_core i2c_i801 i2c_core cdc_ether usbnet bnx2 mii iTCO_wdt iTCO_vendor_support shpchp rtc_cmos pci_hotplug tpm_tis sg tpm pcspkr tpm_bios serio_raw button ext3 jbd mbcache uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif usb_common processor thermal_sys hwmon scsi_dh_emc scsi_dh_rdac scsi_dh_alua scsi_dh_hp_sw scsi_dh ata_generic ata_piix libata megaraid_sas scsi_mod
> 
>  Pid: 16196, comm: Hibackp Not tainted 3.0.13-0.27-default #1 IBM System x3550 M3 -[7944 K3G]-/94Y7614
>  RIP: 0010:[<ffffffff8103157e>]  [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
>  RSP: 0018:ffff88094165fe80  EFLAGS: 00010246
>  RAX: 00003300ff33b000 RBX: ffff880100000000 RCX: 0000000000000000
>  RDX: 0000000100000000 RSI: ffff880000000000 RDI: ff32b300ff33b400
>  RBP: 0000000000001000 R08: 00003ffffffff000 R09: 0000000000000000
>  R10: 22302e31223d6e6f R11: 0000000000000246 R12: 0000000000001000
>  R13: 0000000000003000 R14: 0000000000571be0 R15: ffff88094165ff50
>  FS:  00007ff152d33700(0000) GS:ffff88097f2c0000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>  CR2: ffffbb00ff33b000 CR3: 00000009405a3000 CR4: 00000000000006e0
>  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>  Process Hibackp (pid: 16196, threadinfo ffff88094165e000, task ffff8808eb9ba600)
>  Stack:
>   ffffffff811b8aaa 0000000000004000 ffff880943fea480 ffff8808ef2bae50
>   ffff880943d32980 fffffffffffffffb ffff8808ef2bae40 ffff88094165ff50
>   0000000000004000 000000000056ebe0 ffffffff811ad847 000000000056ebe0
>  Call Trace:
>   [<ffffffff811b8aaa>] read_kcore+0x17a/0x370
>   [<ffffffff811ad847>] proc_reg_read+0x77/0xc0
>   [<ffffffff81151687>] vfs_read+0xc7/0x130
>   [<ffffffff811517f3>] sys_read+0x53/0xa0
>   [<ffffffff81449692>] system_call_fastpath+0x16/0x1b
> 
> Investigation determined that the bug triggered when reading system RAM
> at the 4G mark. On this system, that was the first address using 1G pages
> for the virt->phys direct mapping so the PUD is pointing to a physical
> address, not a PMD page.  The problem is that the page table walker in
> kern_addr_valid() is not checking pud_large() and treats the physical
> address as if it was a PMD.  If it happens to look like pmd_none then it'll
> silently fail, probably returning zeros instead of real data. If the data
> happens to look like a present PMD though, it will be walked resulting in
> the oops above. This patch adds the necessary pud_large() check.
> 
> Unfortunately the problem was not readily reproducible and now they are
> running the backup program without accessing /proc/kcore so the patch has
> not been validated but I think it makes sense. If reviewers agree then it
> should also be included in -stable back as far as 3.0-stable.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Mel Gorman <mgorman@suse.de>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Agreed also on the backporting to -stable as far as possible.

* Re: [PATCH] x86: mm: Check if PUD is large when validating a kernel address
  2013-02-11 14:52 ` Mel Gorman
@ 2013-02-12 17:43   ` Michal Hocko
  -1 siblings, 0 replies; 23+ messages in thread
From: Michal Hocko @ 2013-02-12 17:43 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Ingo Molnar, Andrew Morton, linux-kernel, linux-mm

On Mon 11-02-13 14:52:36, Mel Gorman wrote:
> A user reported the following oops when a backup process read
> /proc/kcore.
> 
>  BUG: unable to handle kernel paging request at ffffbb00ff33b000
>  IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
>  PGD 0
>  Oops: 0000 [#1] SMP
>  CPU 6
>  Modules linked in: af_packet nfs lockd fscache auth_rpcgss nfs_acl sunrpc 8021q garp stp llc cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf microcode fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod ioatdma ipv6 ipv6_lib igb dca i7core_edac edac_core i2c_i801 i2c_core cdc_ether usbnet bnx2 mii iTCO_wdt iTCO_vendor_support shpchp rtc_cmos pci_hotplug tpm_tis sg tpm pcspkr tpm_bios serio_raw button ext3 jbd mbcache uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif usb_common processor thermal_sys hwmon scsi_dh_emc scsi_dh_rdac scsi_dh_alua scsi_dh_hp_sw scsi_dh ata_generic ata_piix libata megaraid_sas scsi_mod
> 
>  Pid: 16196, comm: Hibackp Not tainted 3.0.13-0.27-default #1 IBM System x3550 M3 -[7944 K3G]-/94Y7614
>  RIP: 0010:[<ffffffff8103157e>]  [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
>  RSP: 0018:ffff88094165fe80  EFLAGS: 00010246
>  RAX: 00003300ff33b000 RBX: ffff880100000000 RCX: 0000000000000000
>  RDX: 0000000100000000 RSI: ffff880000000000 RDI: ff32b300ff33b400
>  RBP: 0000000000001000 R08: 00003ffffffff000 R09: 0000000000000000
>  R10: 22302e31223d6e6f R11: 0000000000000246 R12: 0000000000001000
>  R13: 0000000000003000 R14: 0000000000571be0 R15: ffff88094165ff50
>  FS:  00007ff152d33700(0000) GS:ffff88097f2c0000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>  CR2: ffffbb00ff33b000 CR3: 00000009405a3000 CR4: 00000000000006e0
>  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>  Process Hibackp (pid: 16196, threadinfo ffff88094165e000, task ffff8808eb9ba600)
>  Stack:
>   ffffffff811b8aaa 0000000000004000 ffff880943fea480 ffff8808ef2bae50
>   ffff880943d32980 fffffffffffffffb ffff8808ef2bae40 ffff88094165ff50
>   0000000000004000 000000000056ebe0 ffffffff811ad847 000000000056ebe0
>  Call Trace:
>   [<ffffffff811b8aaa>] read_kcore+0x17a/0x370
>   [<ffffffff811ad847>] proc_reg_read+0x77/0xc0
>   [<ffffffff81151687>] vfs_read+0xc7/0x130
>   [<ffffffff811517f3>] sys_read+0x53/0xa0
>   [<ffffffff81449692>] system_call_fastpath+0x16/0x1b
> 
> Investigation determined that the bug triggered when reading system RAM
> at the 4G mark. On this system, that was the first address using 1G pages
> for the virt->phys direct mapping so the PUD is pointing to a physical
> address, not a PMD page.  The problem is that the page table walker in
> kern_addr_valid() is not checking pud_large() and treats the physical
> address as if it was a PMD.  If it happens to look like pmd_none then it'll
> silently fail, probably returning zeros instead of real data. If the data
> happens to look like a present PMD though, it will be walked resulting in
> the oops above. This patch adds the necessary pud_large() check.
> 
> Unfortunately the problem was not readily reproducible and now they are
> running the backup program without accessing /proc/kcore so the patch has
> not been validated but I think it makes sense. If reviewers agree then it
> should also be included in -stable back as far as 3.0-stable.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Mel Gorman <mgorman@suse.de>

Reviewed-by: Michal Hocko <mhocko@suse.cz>

> ---
>  arch/x86/include/asm/pgtable.h |    5 +++++
>  arch/x86/mm/init_64.c          |    3 +++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index 5199db2..1c1a955 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -142,6 +142,11 @@ static inline unsigned long pmd_pfn(pmd_t pmd)
>  	return (pmd_val(pmd) & PTE_PFN_MASK) >> PAGE_SHIFT;
>  }
>  
> +static inline unsigned long pud_pfn(pud_t pud)
> +{
> +	return (pud_val(pud) & PTE_PFN_MASK) >> PAGE_SHIFT;
> +}
> +
>  #define pte_page(pte)	pfn_to_page(pte_pfn(pte))
>  
>  static inline int pmd_large(pmd_t pte)
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 2ead3c8..75c9a6a 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -831,6 +831,9 @@ int kern_addr_valid(unsigned long addr)
>  	if (pud_none(*pud))
>  		return 0;
>  
> +	if (pud_large(*pud))
> +		return pfn_valid(pud_pfn(*pud));
> +
>  	pmd = pmd_offset(pud, addr);
>  	if (pmd_none(*pmd))
>  		return 0;
> 

-- 
Michal Hocko
SUSE Labs

* [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2
  2013-02-11 14:52 ` Mel Gorman
@ 2013-02-13 11:02   ` Mel Gorman
  -1 siblings, 0 replies; 23+ messages in thread
From: Mel Gorman @ 2013-02-13 11:02 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton; +Cc: linux-kernel, linux-mm, riel, mhocko, hannes

Andrew or Ingo, please pick up.

Changelog since v1
  o Add reviewed-bys and acked-bys

A user reported a bug whereby a backup process accessing /proc/kcore
caused an oops.

 BUG: unable to handle kernel paging request at ffffbb00ff33b000
 IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
 PGD 0
 Oops: 0000 [#1] SMP
 CPU 6
 Modules linked in: af_packet nfs lockd fscache auth_rpcgss nfs_acl sunrpc 8021q garp stp llc cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf microcode fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod ioatdma ipv6 ipv6_lib igb dca i7core_edac edac_core i2c_i801 i2c_core cdc_ether usbnet bnx2 mii iTCO_wdt iTCO_vendor_support shpchp rtc_cmos pci_hotplug tpm_tis sg tpm pcspkr tpm_bios serio_raw button ext3 jbd mbcache uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif usb_common processor thermal_sys hwmon scsi_dh_emc scsi_dh_rdac scsi_dh_alua scsi_dh_hp_sw scsi_dh ata_generic ata_piix libata megaraid_sas scsi_mod

 Pid: 16196, comm: Hibackp Not tainted 3.0.13-0.27-default #1 IBM System x3550 M3 -[7944 K3G]-/94Y7614
 RIP: 0010:[<ffffffff8103157e>]  [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
 RSP: 0018:ffff88094165fe80  EFLAGS: 00010246
 RAX: 00003300ff33b000 RBX: ffff880100000000 RCX: 0000000000000000
 RDX: 0000000100000000 RSI: ffff880000000000 RDI: ff32b300ff33b400
 RBP: 0000000000001000 R08: 00003ffffffff000 R09: 0000000000000000
 R10: 22302e31223d6e6f R11: 0000000000000246 R12: 0000000000001000
 R13: 0000000000003000 R14: 0000000000571be0 R15: ffff88094165ff50
 FS:  00007ff152d33700(0000) GS:ffff88097f2c0000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: ffffbb00ff33b000 CR3: 00000009405a3000 CR4: 00000000000006e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process Hibackp (pid: 16196, threadinfo ffff88094165e000, task ffff8808eb9ba600)
 Stack:
  ffffffff811b8aaa 0000000000004000 ffff880943fea480 ffff8808ef2bae50
  ffff880943d32980 fffffffffffffffb ffff8808ef2bae40 ffff88094165ff50
  0000000000004000 000000000056ebe0 ffffffff811ad847 000000000056ebe0
 Call Trace:
  [<ffffffff811b8aaa>] read_kcore+0x17a/0x370
  [<ffffffff811ad847>] proc_reg_read+0x77/0xc0
  [<ffffffff81151687>] vfs_read+0xc7/0x130
  [<ffffffff811517f3>] sys_read+0x53/0xa0
  [<ffffffff81449692>] system_call_fastpath+0x16/0x1b

Investigation determined that the bug triggered when reading system RAM
at the 4G mark. On this system, that was the first address using 1G pages
for the virt->phys direct mapping so the PUD is pointing to a physical
address, not a PMD page.  The problem is that the page table walker in
kern_addr_valid() is not checking pud_large() and treats the physical
address as if it was a PMD.  If it happens to look like pmd_none then it'll
silently fail, probably returning zeros instead of real data. If the data
happens to look like a present PMD though, it will be walked resulting in
the oops above. This patch adds the necessary pud_large() check.
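
The register dump above appears to line up with this explanation. What follows is only a sketch under assumptions that the report does not state and that are inferred from the values: that RDI holds the bogus "PMD entry" read out of the 1G page, that R08 holds PTE_PFN_MASK, and that RSI holds PAGE_OFFSET. The helper name pte_table_vaddr() is hypothetical. Masking the entry and adding the direct-map base reproduces the faulting address in CR2.

```c
#include <stdint.h>

/* Values copied from the register dump in the oops above. */
#define OOPS_RDI 0xff32b300ff33b400ULL /* assumed: garbage read as a PMD entry */
#define OOPS_R08 0x00003ffffffff000ULL /* assumed: PTE_PFN_MASK on this kernel */
#define OOPS_RSI 0xffff880000000000ULL /* assumed: PAGE_OFFSET (direct-map base) */
#define OOPS_RAX 0x00003300ff33b000ULL /* RAX as reported */
#define OOPS_CR2 0xffffbb00ff33b000ULL /* faulting address as reported */

/*
 * Hypothetical model of what a walker does with a PMD entry it believes
 * is present: drop the flag bits to get the physical address of the PTE
 * table, then convert it to a virtual address through the direct mapping.
 */
static uint64_t pte_table_vaddr(uint64_t pmd_entry, uint64_t pfn_mask,
				uint64_t page_offset)
{
	return (pmd_entry & pfn_mask) + page_offset;
}
```

With these inputs, OOPS_RDI & OOPS_R08 equals the reported RAX, and pte_table_vaddr(OOPS_RDI, OOPS_R08, OOPS_RSI) equals the reported CR2. That is what one would expect if the walker dereferenced a PTE table address built from RAM contents rather than from a real PMD.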

Cc: stable@vger.kernel.org
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
---
 arch/x86/include/asm/pgtable.h |    5 +++++
 arch/x86/mm/init_64.c          |    3 +++
 2 files changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 5199db2..1c1a955 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -142,6 +142,11 @@ static inline unsigned long pmd_pfn(pmd_t pmd)
 	return (pmd_val(pmd) & PTE_PFN_MASK) >> PAGE_SHIFT;
 }
 
+static inline unsigned long pud_pfn(pud_t pud)
+{
+	return (pud_val(pud) & PTE_PFN_MASK) >> PAGE_SHIFT;
+}
+
 #define pte_page(pte)	pfn_to_page(pte_pfn(pte))
 
 static inline int pmd_large(pmd_t pte)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 2ead3c8..75c9a6a 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -831,6 +831,9 @@ int kern_addr_valid(unsigned long addr)
 	if (pud_none(*pud))
 		return 0;
 
+	if (pud_large(*pud))
+		return pfn_valid(pud_pfn(*pud));
+
 	pmd = pmd_offset(pud, addr);
 	if (pmd_none(*pmd))
 		return 0;

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2
  2013-02-13 11:02   ` Mel Gorman
@ 2013-02-13 11:10     ` Ingo Molnar
  -1 siblings, 0 replies; 23+ messages in thread
From: Ingo Molnar @ 2013-02-13 11:10 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Andrew Morton, linux-kernel, linux-mm, riel, mhocko, hannes


* Mel Gorman <mgorman@suse.de> wrote:

> Andrew or Ingo, please pick up.

Already did - will push it out later today.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2
  2013-02-13 11:10     ` Ingo Molnar
@ 2013-02-13 11:14       ` Mel Gorman
  -1 siblings, 0 replies; 23+ messages in thread
From: Mel Gorman @ 2013-02-13 11:14 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Andrew Morton, linux-kernel, linux-mm, riel, mhocko, hannes

On Wed, Feb 13, 2013 at 12:10:31PM +0100, Ingo Molnar wrote:
> 
> * Mel Gorman <mgorman@suse.de> wrote:
> 
> > Andrew or Ingo, please pick up.
> 
> Already did - will push it out later today.
> 

Whoops, thanks. Sorry for the noise.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [tip:x86/urgent] x86/mm: Check if PUD is large when validating a kernel address
  2013-02-11 14:52 ` Mel Gorman
                   ` (4 preceding siblings ...)
  (?)
@ 2013-02-13 12:12 ` tip-bot for Mel Gorman
  -1 siblings, 0 replies; 23+ messages in thread
From: tip-bot for Mel Gorman @ 2013-02-13 12:12 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, hannes, riel, mgorman, tglx, mhocko

Commit-ID:  0ee364eb316348ddf3e0dfcd986f5f13f528f821
Gitweb:     http://git.kernel.org/tip/0ee364eb316348ddf3e0dfcd986f5f13f528f821
Author:     Mel Gorman <mgorman@suse.de>
AuthorDate: Mon, 11 Feb 2013 14:52:36 +0000
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 13 Feb 2013 10:02:55 +0100

x86/mm: Check if PUD is large when validating a kernel address

A user reported the following oops when a backup process reads
/proc/kcore:

 BUG: unable to handle kernel paging request at ffffbb00ff33b000
 IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
 [...]

 Call Trace:
  [<ffffffff811b8aaa>] read_kcore+0x17a/0x370
  [<ffffffff811ad847>] proc_reg_read+0x77/0xc0
  [<ffffffff81151687>] vfs_read+0xc7/0x130
  [<ffffffff811517f3>] sys_read+0x53/0xa0
  [<ffffffff81449692>] system_call_fastpath+0x16/0x1b

Investigation determined that the bug triggered when reading
system RAM at the 4G mark. On this system, that was the first
address using 1G pages for the virt->phys direct mapping so the
PUD is pointing to a physical address, not a PMD page.

The problem is that the page table walker in kern_addr_valid() is
not checking pud_large() and treats the physical address as if
it was a PMD.  If it happens to look like pmd_none then it'll
silently fail, probably returning zeros instead of real data. If
the data happens to look like a present PMD though, it will be
walked resulting in the oops above.

This patch adds the necessary pud_large() check.

Unfortunately the problem was not readily reproducible and now
they are running the backup program without accessing
/proc/kcore so the patch has not been validated but I think it
makes sense.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: stable@vger.kernel.org
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/20130211145236.GX21389@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/pgtable.h | 5 +++++
 arch/x86/mm/init_64.c          | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 5199db2..1c1a955 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -142,6 +142,11 @@ static inline unsigned long pmd_pfn(pmd_t pmd)
 	return (pmd_val(pmd) & PTE_PFN_MASK) >> PAGE_SHIFT;
 }
 
+static inline unsigned long pud_pfn(pud_t pud)
+{
+	return (pud_val(pud) & PTE_PFN_MASK) >> PAGE_SHIFT;
+}
+
 #define pte_page(pte)	pfn_to_page(pte_pfn(pte))
 
 static inline int pmd_large(pmd_t pte)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 2ead3c8..75c9a6a 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -831,6 +831,9 @@ int kern_addr_valid(unsigned long addr)
 	if (pud_none(*pud))
 		return 0;
 
+	if (pud_large(*pud))
+		return pfn_valid(pud_pfn(*pud));
+
 	pmd = pmd_offset(pud, addr);
 	if (pmd_none(*pmd))
 		return 0;
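
For readers following the fix, here is a minimal user-space sketch of the decision the patched kern_addr_valid() makes at the PUD level. It is a model, not the kernel code: the MODEL_* constants and model_* helpers are hypothetical names, the 46-bit PFN mask is an assumption about this kernel, and model_pfn_valid() is a simple stand-in (a bound on the PFN) for the real pfn_valid().

```c
#include <stdint.h>

/* x86-64 page-table entry bits used by this model. */
#define MODEL_PAGE_PRESENT (1ULL << 0)
#define MODEL_PAGE_PSE     (1ULL << 7)           /* entry maps a large page */
#define MODEL_PFN_MASK     0x00003ffffffff000ULL /* assumed 46-bit physical mask */
#define MODEL_PAGE_SHIFT   12

enum pud_verdict { PUD_INVALID, PUD_VALID_LARGE, PUD_DESCEND };

/* Stand-in for pfn_valid(): accept any PFN below max_pfn. */
static int model_pfn_valid(uint64_t pfn, uint64_t max_pfn)
{
	return pfn < max_pfn;
}

/* A PUD is "large" when both the present and the PSE bits are set. */
static int model_pud_large(uint64_t pud)
{
	uint64_t both = MODEL_PAGE_PSE | MODEL_PAGE_PRESENT;

	return (pud & both) == both;
}

static enum pud_verdict model_check_pud(uint64_t pud, uint64_t max_pfn)
{
	uint64_t pfn;

	if (pud == 0)			/* models pud_none() */
		return PUD_INVALID;

	if (model_pud_large(pud)) {
		/* 1G mapping: validate the PFN, do not walk any further. */
		pfn = (pud & MODEL_PFN_MASK) >> MODEL_PAGE_SHIFT;
		return model_pfn_valid(pfn, max_pfn) ? PUD_VALID_LARGE
						     : PUD_INVALID;
	}

	return PUD_DESCEND;		/* entry points at a PMD page */
}
```

In this model, a PUD value of 0x1000001e3 (a 1G page at physical 4G with the present and PSE bits set) yields PUD_VALID_LARGE when max_pfn is 0x900000, while an entry without PSE yields PUD_DESCEND. Without the large-page branch, the first case would fall through to the PMD walk, which is the failure described in the changelog.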

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2
  2013-02-13 11:02   ` Mel Gorman
@ 2013-03-01  6:43     ` Simon Jeons
  -1 siblings, 0 replies; 23+ messages in thread
From: Simon Jeons @ 2013-03-01  6:43 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Ingo Molnar, Andrew Morton, linux-kernel, linux-mm, riel, mhocko,
	hannes

On 02/13/2013 07:02 PM, Mel Gorman wrote:
> Andrew or Ingo, please pick up.
>
> Changelog since v1
>    o Add reviewed-bys and acked-bys
>
> A user reported a bug whereby a backup process accessing /proc/kcore
> caused an oops.
>
>   BUG: unable to handle kernel paging request at ffffbb00ff33b000
>   IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
>   PGD 0
>   Oops: 0000 [#1] SMP
>   CPU 6
>   Modules linked in: af_packet nfs lockd fscache auth_rpcgss nfs_acl sunrpc 8021q garp stp llc cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf microcode fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod ioatdma ipv6 ipv6_lib igb dca i7core_edac edac_core i2c_i801 i2c_core cdc_ether usbnet bnx2 mii iTCO_wdt iTCO_vendor_support shpchp rtc_cmos pci_hotplug tpm_tis sg tpm pcspkr tpm_bios serio_raw button ext3 jbd mbcache uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif usb_common processor thermal_sys hwmon scsi_dh_emc scsi_dh_rdac scsi_dh_alua scsi_dh_hp_sw scsi_dh ata_generic ata_piix libata megaraid_sas scsi_mod
>
>   Pid: 16196, comm: Hibackp Not tainted 3.0.13-0.27-default #1 IBM System x3550 M3 -[7944 K3G]-/94Y7614
>   RIP: 0010:[<ffffffff8103157e>]  [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
>   RSP: 0018:ffff88094165fe80  EFLAGS: 00010246
>   RAX: 00003300ff33b000 RBX: ffff880100000000 RCX: 0000000000000000
>   RDX: 0000000100000000 RSI: ffff880000000000 RDI: ff32b300ff33b400
>   RBP: 0000000000001000 R08: 00003ffffffff000 R09: 0000000000000000
>   R10: 22302e31223d6e6f R11: 0000000000000246 R12: 0000000000001000
>   R13: 0000000000003000 R14: 0000000000571be0 R15: ffff88094165ff50
>   FS:  00007ff152d33700(0000) GS:ffff88097f2c0000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>   CR2: ffffbb00ff33b000 CR3: 00000009405a3000 CR4: 00000000000006e0
>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>   Process Hibackp (pid: 16196, threadinfo ffff88094165e000, task ffff8808eb9ba600)
>   Stack:
>    ffffffff811b8aaa 0000000000004000 ffff880943fea480 ffff8808ef2bae50
>    ffff880943d32980 fffffffffffffffb ffff8808ef2bae40 ffff88094165ff50
>    0000000000004000 000000000056ebe0 ffffffff811ad847 000000000056ebe0
>   Call Trace:
>    [<ffffffff811b8aaa>] read_kcore+0x17a/0x370
>    [<ffffffff811ad847>] proc_reg_read+0x77/0xc0
>    [<ffffffff81151687>] vfs_read+0xc7/0x130
>    [<ffffffff811517f3>] sys_read+0x53/0xa0
>    [<ffffffff81449692>] system_call_fastpath+0x16/0x1b
>
> Investigation determined that the bug triggered when reading system RAM
> at the 4G mark. On this system, that was the first address using 1G pages

Do you mean there is one page which is 1G?

> for the virt->phys direct mapping so the PUD is pointing to a physical
> address, not a PMD page.  The problem is that the page table walker in
> kern_addr_valid() is not checking pud_large() and treats the physical
> address as if it was a PMD.  If it happens to look like pmd_none then it'll
> silently fail, probably returning zeros instead of real data. If the data
> happens to look like a present PMD though, it will be walked resulting in
> the oops above. This patch adds the necessary pud_large() check.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Mel Gorman <mgorman@suse.de>
> Reviewed-by: Rik van Riel <riel@redhat.com>
> Reviewed-by: Michal Hocko <mhocko@suse.cz>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
>   arch/x86/include/asm/pgtable.h |    5 +++++
>   arch/x86/mm/init_64.c          |    3 +++
>   2 files changed, 8 insertions(+)
>
> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index 5199db2..1c1a955 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -142,6 +142,11 @@ static inline unsigned long pmd_pfn(pmd_t pmd)
>   	return (pmd_val(pmd) & PTE_PFN_MASK) >> PAGE_SHIFT;
>   }
>   
> +static inline unsigned long pud_pfn(pud_t pud)
> +{
> +	return (pud_val(pud) & PTE_PFN_MASK) >> PAGE_SHIFT;
> +}
> +
>   #define pte_page(pte)	pfn_to_page(pte_pfn(pte))
>   
>   static inline int pmd_large(pmd_t pte)
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 2ead3c8..75c9a6a 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -831,6 +831,9 @@ int kern_addr_valid(unsigned long addr)
>   	if (pud_none(*pud))
>   		return 0;
>   
> +	if (pud_large(*pud))
> +		return pfn_valid(pud_pfn(*pud));
> +
>   	pmd = pmd_offset(pud, addr);
>   	if (pmd_none(*pmd))
>   		return 0;

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2
  2013-03-01  6:43     ` Simon Jeons
  (?)
@ 2013-03-01  9:15     ` Chen Gong
  2013-03-01  9:21         ` Simon Jeons
  -1 siblings, 1 reply; 23+ messages in thread
From: Chen Gong @ 2013-03-01  9:15 UTC (permalink / raw)
  To: Simon Jeons
  Cc: Mel Gorman, Ingo Molnar, Andrew Morton, linux-kernel, linux-mm,
	riel, mhocko, hannes

[-- Attachment #1: Type: text/plain, Size: 3623 bytes --]

On Fri, Mar 01, 2013 at 02:43:53PM +0800, Simon Jeons wrote:
> Date: Fri, 01 Mar 2013 14:43:53 +0800
> From: Simon Jeons <simon.jeons@gmail.com>
> To: Mel Gorman <mgorman@suse.de>
> CC: Ingo Molnar <mingo@kernel.org>, Andrew Morton
>  <akpm@linux-foundation.org>, linux-kernel@vger.kernel.org,
>  linux-mm@kvack.org, riel@redhat.com, mhocko@suse.cz, hannes@cmpxchg.org
> Subject: Re: [PATCH] x86: mm: Check if PUD is large when validating a
>  kernel address v2
> User-Agent: Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130221
>  Thunderbird/17.0.3
> 
> On 02/13/2013 07:02 PM, Mel Gorman wrote:
> >Andrew or Ingo, please pick up.
> >
> >Changelog since v1
> >   o Add reviewed-bys and acked-bys
> >
> >A user reported a bug whereby a backup process accessing /proc/kcore
> >caused an oops.
> >
> >  BUG: unable to handle kernel paging request at ffffbb00ff33b000
> >  IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
> >  PGD 0
> >  Oops: 0000 [#1] SMP
> >  CPU 6
> >  Modules linked in: af_packet nfs lockd fscache auth_rpcgss nfs_acl sunrpc 8021q garp stp llc cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf microcode fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod ioatdma ipv6 ipv6_lib igb dca i7core_edac edac_core i2c_i801 i2c_core cdc_ether usbnet bnx2 mii iTCO_wdt iTCO_vendor_support shpchp rtc_cmos pci_hotplug tpm_tis sg tpm pcspkr tpm_bios serio_raw button ext3 jbd mbcache uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif usb_common processor thermal_sys hwmon scsi_dh_emc scsi_dh_rdac scsi_dh_alua scsi_dh_hp_sw scsi_dh ata_generic ata_piix libata megaraid_sas scsi_mod
> >
> >  Pid: 16196, comm: Hibackp Not tainted 3.0.13-0.27-default #1 IBM System x3550 M3 -[7944 K3G]-/94Y7614
> >  RIP: 0010:[<ffffffff8103157e>]  [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
> >  RSP: 0018:ffff88094165fe80  EFLAGS: 00010246
> >  RAX: 00003300ff33b000 RBX: ffff880100000000 RCX: 0000000000000000
> >  RDX: 0000000100000000 RSI: ffff880000000000 RDI: ff32b300ff33b400
> >  RBP: 0000000000001000 R08: 00003ffffffff000 R09: 0000000000000000
> >  R10: 22302e31223d6e6f R11: 0000000000000246 R12: 0000000000001000
> >  R13: 0000000000003000 R14: 0000000000571be0 R15: ffff88094165ff50
> >  FS:  00007ff152d33700(0000) GS:ffff88097f2c0000(0000) knlGS:0000000000000000
> >  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> >  CR2: ffffbb00ff33b000 CR3: 00000009405a3000 CR4: 00000000000006e0
> >  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >  Process Hibackp (pid: 16196, threadinfo ffff88094165e000, task ffff8808eb9ba600)
> >  Stack:
> >   ffffffff811b8aaa 0000000000004000 ffff880943fea480 ffff8808ef2bae50
> >   ffff880943d32980 fffffffffffffffb ffff8808ef2bae40 ffff88094165ff50
> >   0000000000004000 000000000056ebe0 ffffffff811ad847 000000000056ebe0
> >  Call Trace:
> >   [<ffffffff811b8aaa>] read_kcore+0x17a/0x370
> >   [<ffffffff811ad847>] proc_reg_read+0x77/0xc0
> >   [<ffffffff81151687>] vfs_read+0xc7/0x130
> >   [<ffffffff811517f3>] sys_read+0x53/0xa0
> >   [<ffffffff81449692>] system_call_fastpath+0x16/0x1b
> >
> >Investigation determined that the bug triggered when reading system RAM
> >at the 4G mark. On this system, that was the first address using 1G pages
> 
> Do you mean there is one page which is 1G?
> 
1GB page support in the native kernel started in 2.6.27 with these two commits:
39c11e6 and b4718e6. Intel CPUs support 1GB pages from Westmere onward.
BTW, the IBM System x3550 M3 is a Westmere-based system.
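
One way to check a given machine is to look for the pdpe1gb flag on the "flags" line of /proc/cpuinfo. Below is a minimal sketch of the string check only; the helper name has_cpu_flag() is hypothetical, and reading the line from /proc/cpuinfo is left to the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if `flag` appears as a whole word in `flags_line`.
 * Words are delimited by spaces, tabs, a colon, a newline, or the
 * start/end of the string, so "pse" does not match "pse36".
 */
static bool has_cpu_flag(const char *flags_line, const char *flag)
{
	size_t len = strlen(flag);
	const char *p = flags_line;

	if (len == 0)
		return false;

	while ((p = strstr(p, flag)) != NULL) {
		bool starts_word = p == flags_line || p[-1] == ' ' ||
				   p[-1] == '\t' || p[-1] == ':';
		bool ends_word = p[len] == '\0' || p[len] == ' ' ||
				 p[len] == '\t' || p[len] == '\n';

		if (starts_word && ends_word)
			return true;
		p += len;	/* keep searching after this partial match */
	}
	return false;
}
```

A caller would pass each line of /proc/cpuinfo that begins with "flags" and test for "pdpe1gb". Whether the kernel actually uses 1G pages for the direct mapping also depends on kernel configuration and memory layout, which this check does not cover.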

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2
  2013-03-01  9:15     ` Chen Gong
@ 2013-03-01  9:21         ` Simon Jeons
  0 siblings, 0 replies; 23+ messages in thread
From: Simon Jeons @ 2013-03-01  9:21 UTC (permalink / raw)
  To: Mel Gorman, Ingo Molnar, Andrew Morton, linux-kernel, linux-mm,
	riel, mhocko, hannes

On 03/01/2013 05:15 PM, Chen Gong wrote:
> On Fri, Mar 01, 2013 at 02:43:53PM +0800, Simon Jeons wrote:
>> Date: Fri, 01 Mar 2013 14:43:53 +0800
>> From: Simon Jeons <simon.jeons@gmail.com>
>> To: Mel Gorman <mgorman@suse.de>
>> CC: Ingo Molnar <mingo@kernel.org>, Andrew Morton
>>   <akpm@linux-foundation.org>, linux-kernel@vger.kernel.org,
>>   linux-mm@kvack.org, riel@redhat.com, mhocko@suse.cz, hannes@cmpxchg.org
>> Subject: Re: [PATCH] x86: mm: Check if PUD is large when validating a
>>   kernel address v2
>> User-Agent: Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130221
>>   Thunderbird/17.0.3
>>
>> On 02/13/2013 07:02 PM, Mel Gorman wrote:
>>> Andrew or Ingo, please pick up.
>>>
>>> Changelog since v1
>>>    o Add reviewed-bys and acked-bys
>>>
>>> A user reported a bug whereby a backup process accessing /proc/kcore
>>> caused an oops.
>>>
>>>   BUG: unable to handle kernel paging request at ffffbb00ff33b000
>>>   IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
>>>   PGD 0
>>>   Oops: 0000 [#1] SMP
>>>   CPU 6
>>>   Modules linked in: af_packet nfs lockd fscache auth_rpcgss nfs_acl sunrpc 8021q garp stp llc cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf microcode fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod ioatdma ipv6 ipv6_lib igb dca i7core_edac edac_core i2c_i801 i2c_core cdc_ether usbnet bnx2 mii iTCO_wdt iTCO_vendor_support shpchp rtc_cmos pci_hotplug tpm_tis sg tpm pcspkr tpm_bios serio_raw button ext3 jbd mbcache uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif usb_common processor thermal_sys hwmon scsi_dh_emc scsi_dh_rdac scsi_dh_alua scsi_dh_hp_sw scsi_dh ata_generic ata_piix libata megaraid_sas scsi_mod
>>>
>>>   Pid: 16196, comm: Hibackp Not tainted 3.0.13-0.27-default #1 IBM System x3550 M3 -[7944 K3G]-/94Y7614
>>>   RIP: 0010:[<ffffffff8103157e>]  [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
>>>   RSP: 0018:ffff88094165fe80  EFLAGS: 00010246
>>>   RAX: 00003300ff33b000 RBX: ffff880100000000 RCX: 0000000000000000
>>>   RDX: 0000000100000000 RSI: ffff880000000000 RDI: ff32b300ff33b400
>>>   RBP: 0000000000001000 R08: 00003ffffffff000 R09: 0000000000000000
>>>   R10: 22302e31223d6e6f R11: 0000000000000246 R12: 0000000000001000
>>>   R13: 0000000000003000 R14: 0000000000571be0 R15: ffff88094165ff50
>>>   FS:  00007ff152d33700(0000) GS:ffff88097f2c0000(0000) knlGS:0000000000000000
>>>   CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>>   CR2: ffffbb00ff33b000 CR3: 00000009405a3000 CR4: 00000000000006e0
>>>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>   DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>>   Process Hibackp (pid: 16196, threadinfo ffff88094165e000, task ffff8808eb9ba600)
>>>   Stack:
>>>    ffffffff811b8aaa 0000000000004000 ffff880943fea480 ffff8808ef2bae50
>>>    ffff880943d32980 fffffffffffffffb ffff8808ef2bae40 ffff88094165ff50
>>>    0000000000004000 000000000056ebe0 ffffffff811ad847 000000000056ebe0
>>>   Call Trace:
>>>    [<ffffffff811b8aaa>] read_kcore+0x17a/0x370
>>>    [<ffffffff811ad847>] proc_reg_read+0x77/0xc0
>>>    [<ffffffff81151687>] vfs_read+0xc7/0x130
>>>    [<ffffffff811517f3>] sys_read+0x53/0xa0
>>>    [<ffffffff81449692>] system_call_fastpath+0x16/0x1b
>>>
>>> Investigation determined that the bug triggered when reading system RAM
>>> at the 4G mark. On this system, that was the first address using 1G pages
>> Do you mean there is one page which is 1G?
>>
> 1GB page support in the native kernel started in 2.6.27 with these two commits:

Why call the kernel native? Which kind of kernel is not native?

> 39c11e6 and b4718e6. On Intel CPUs, 1GB pages are supported from Westmere onward.
> BTW, the IBM System x3550 M3 is a Westmere-based system.
Is it only used for hugetlbfs pages?


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2
  2013-03-01  9:21         ` Simon Jeons
  (?)
@ 2013-03-01  9:35         ` Chen Gong
  2013-03-01  9:39             ` Simon Jeons
  -1 siblings, 1 reply; 23+ messages in thread
From: Chen Gong @ 2013-03-01  9:35 UTC (permalink / raw)
  To: Simon Jeons
  Cc: Mel Gorman, Ingo Molnar, Andrew Morton, linux-kernel, linux-mm,
	riel, mhocko, hannes

[-- Attachment #1: Type: text/plain, Size: 4505 bytes --]

On Fri, Mar 01, 2013 at 05:21:35PM +0800, Simon Jeons wrote:
> Date: Fri, 01 Mar 2013 17:21:35 +0800
> From: Simon Jeons <simon.jeons@gmail.com>
> To: Mel Gorman <mgorman@suse.de>, Ingo Molnar <mingo@kernel.org>, Andrew
>  Morton <akpm@linux-foundation.org>, linux-kernel@vger.kernel.org,
>  linux-mm@kvack.org, riel@redhat.com, mhocko@suse.cz, hannes@cmpxchg.org
> Subject: Re: [PATCH] x86: mm: Check if PUD is large when validating a
>  kernel address v2
> User-Agent: Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130221
>  Thunderbird/17.0.3
> 
> On 03/01/2013 05:15 PM, Chen Gong wrote:
> >On Fri, Mar 01, 2013 at 02:43:53PM +0800, Simon Jeons wrote:
> >>Date: Fri, 01 Mar 2013 14:43:53 +0800
> >>From: Simon Jeons <simon.jeons@gmail.com>
> >>To: Mel Gorman <mgorman@suse.de>
> >>CC: Ingo Molnar <mingo@kernel.org>, Andrew Morton
> >>  <akpm@linux-foundation.org>, linux-kernel@vger.kernel.org,
> >>  linux-mm@kvack.org, riel@redhat.com, mhocko@suse.cz, hannes@cmpxchg.org
> >>Subject: Re: [PATCH] x86: mm: Check if PUD is large when validating a
> >>  kernel address v2
> >>User-Agent: Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130221
> >>  Thunderbird/17.0.3
> >>
> >>On 02/13/2013 07:02 PM, Mel Gorman wrote:
> >>>Andrew or Ingo, please pick up.
> >>>
> >>>Changelog since v1
> >>>   o Add reviewed-bys and acked-bys
> >>>
> >>>A user reported a bug whereby a backup process accessing /proc/kcore
> >>>caused an oops.
> >>>
> >>>  BUG: unable to handle kernel paging request at ffffbb00ff33b000
> >>>  IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
> >>>  PGD 0
> >>>  Oops: 0000 [#1] SMP
> >>>  CPU 6
> >>>  Modules linked in: af_packet nfs lockd fscache auth_rpcgss nfs_acl sunrpc 8021q garp stp llc cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf microcode fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod ioatdma ipv6 ipv6_lib igb dca i7core_edac edac_core i2c_i801 i2c_core cdc_ether usbnet bnx2 mii iTCO_wdt iTCO_vendor_support shpchp rtc_cmos pci_hotplug tpm_tis sg tpm pcspkr tpm_bios serio_raw button ext3 jbd mbcache uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif usb_common processor thermal_sys hwmon scsi_dh_emc scsi_dh_rdac scsi_dh_alua scsi_dh_hp_sw scsi_dh ata_generic ata_piix libata megaraid_sas scsi_mod
> >>>
> >>>  Pid: 16196, comm: Hibackp Not tainted 3.0.13-0.27-default #1 IBM System x3550 M3 -[7944 K3G]-/94Y7614
> >>>  RIP: 0010:[<ffffffff8103157e>]  [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
> >>>  RSP: 0018:ffff88094165fe80  EFLAGS: 00010246
> >>>  RAX: 00003300ff33b000 RBX: ffff880100000000 RCX: 0000000000000000
> >>>  RDX: 0000000100000000 RSI: ffff880000000000 RDI: ff32b300ff33b400
> >>>  RBP: 0000000000001000 R08: 00003ffffffff000 R09: 0000000000000000
> >>>  R10: 22302e31223d6e6f R11: 0000000000000246 R12: 0000000000001000
> >>>  R13: 0000000000003000 R14: 0000000000571be0 R15: ffff88094165ff50
> >>>  FS:  00007ff152d33700(0000) GS:ffff88097f2c0000(0000) knlGS:0000000000000000
> >>>  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> >>>  CR2: ffffbb00ff33b000 CR3: 00000009405a3000 CR4: 00000000000006e0
> >>>  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >>>  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >>>  Process Hibackp (pid: 16196, threadinfo ffff88094165e000, task ffff8808eb9ba600)
> >>>  Stack:
> >>>   ffffffff811b8aaa 0000000000004000 ffff880943fea480 ffff8808ef2bae50
> >>>   ffff880943d32980 fffffffffffffffb ffff8808ef2bae40 ffff88094165ff50
> >>>   0000000000004000 000000000056ebe0 ffffffff811ad847 000000000056ebe0
> >>>  Call Trace:
> >>>   [<ffffffff811b8aaa>] read_kcore+0x17a/0x370
> >>>   [<ffffffff811ad847>] proc_reg_read+0x77/0xc0
> >>>   [<ffffffff81151687>] vfs_read+0xc7/0x130
> >>>   [<ffffffff811517f3>] sys_read+0x53/0xa0
> >>>   [<ffffffff81449692>] system_call_fastpath+0x16/0x1b
> >>>
> >>>Investigation determined that the bug triggered when reading system RAM
> >>>at the 4G mark. On this system, that was the first address using 1G pages
> >>Do you mean there is one page which is 1G?
> >>
> >1GB page support in the native kernel started in 2.6.27 with these two commits:
> 
> Why call the kernel native? Which kind of kernel is not native?
Native as opposed to running under a VMM such as Xen.

> 
> >39c11e6 and b4718e6. On Intel CPUs, 1GB pages are supported from Westmere onward.
> >BTW, the IBM System x3550 M3 is a Westmere-based system.
> Is it only used for hugetlbfs pages?

Yes, for now.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2
  2013-03-01  9:35         ` Chen Gong
@ 2013-03-01  9:39             ` Simon Jeons
  0 siblings, 0 replies; 23+ messages in thread
From: Simon Jeons @ 2013-03-01  9:39 UTC (permalink / raw)
  To: Mel Gorman, Ingo Molnar, Andrew Morton, linux-kernel, linux-mm,
	riel, mhocko, hannes

On 03/01/2013 05:35 PM, Chen Gong wrote:
> On Fri, Mar 01, 2013 at 05:21:35PM +0800, Simon Jeons wrote:
>> Date: Fri, 01 Mar 2013 17:21:35 +0800
>> From: Simon Jeons <simon.jeons@gmail.com>
>> To: Mel Gorman <mgorman@suse.de>, Ingo Molnar <mingo@kernel.org>, Andrew
>>   Morton <akpm@linux-foundation.org>, linux-kernel@vger.kernel.org,
>>   linux-mm@kvack.org, riel@redhat.com, mhocko@suse.cz, hannes@cmpxchg.org
>> Subject: Re: [PATCH] x86: mm: Check if PUD is large when validating a
>>   kernel address v2
>> User-Agent: Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130221
>>   Thunderbird/17.0.3
>>
>> On 03/01/2013 05:15 PM, Chen Gong wrote:
>>> On Fri, Mar 01, 2013 at 02:43:53PM +0800, Simon Jeons wrote:
>>>> Date: Fri, 01 Mar 2013 14:43:53 +0800
>>>> From: Simon Jeons <simon.jeons@gmail.com>
>>>> To: Mel Gorman <mgorman@suse.de>
>>>> CC: Ingo Molnar <mingo@kernel.org>, Andrew Morton
>>>>   <akpm@linux-foundation.org>, linux-kernel@vger.kernel.org,
>>>>   linux-mm@kvack.org, riel@redhat.com, mhocko@suse.cz, hannes@cmpxchg.org
>>>> Subject: Re: [PATCH] x86: mm: Check if PUD is large when validating a
>>>>   kernel address v2
>>>> User-Agent: Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130221
>>>>   Thunderbird/17.0.3
>>>>
>>>> On 02/13/2013 07:02 PM, Mel Gorman wrote:
>>>>> Andrew or Ingo, please pick up.
>>>>>
>>>>> Changelog since v1
>>>>>    o Add reviewed-bys and acked-bys
>>>>>
>>>>> A user reported a bug whereby a backup process accessing /proc/kcore
>>>>> caused an oops.
>>>>>
>>>>>   BUG: unable to handle kernel paging request at ffffbb00ff33b000
>>>>>   IP: [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
>>>>>   PGD 0
>>>>>   Oops: 0000 [#1] SMP
>>>>>   CPU 6
>>>>>   Modules linked in: af_packet nfs lockd fscache auth_rpcgss nfs_acl sunrpc 8021q garp stp llc cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf microcode fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod ioatdma ipv6 ipv6_lib igb dca i7core_edac edac_core i2c_i801 i2c_core cdc_ether usbnet bnx2 mii iTCO_wdt iTCO_vendor_support shpchp rtc_cmos pci_hotplug tpm_tis sg tpm pcspkr tpm_bios serio_raw button ext3 jbd mbcache uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif usb_common processor thermal_sys hwmon scsi_dh_emc scsi_dh_rdac scsi_dh_alua scsi_dh_hp_sw scsi_dh ata_generic ata_piix libata megaraid_sas scsi_mod
>>>>>
>>>>>   Pid: 16196, comm: Hibackp Not tainted 3.0.13-0.27-default #1 IBM System x3550 M3 -[7944 K3G]-/94Y7614
>>>>>   RIP: 0010:[<ffffffff8103157e>]  [<ffffffff8103157e>] kern_addr_valid+0xbe/0x110
>>>>>   RSP: 0018:ffff88094165fe80  EFLAGS: 00010246
>>>>>   RAX: 00003300ff33b000 RBX: ffff880100000000 RCX: 0000000000000000
>>>>>   RDX: 0000000100000000 RSI: ffff880000000000 RDI: ff32b300ff33b400
>>>>>   RBP: 0000000000001000 R08: 00003ffffffff000 R09: 0000000000000000
>>>>>   R10: 22302e31223d6e6f R11: 0000000000000246 R12: 0000000000001000
>>>>>   R13: 0000000000003000 R14: 0000000000571be0 R15: ffff88094165ff50
>>>>>   FS:  00007ff152d33700(0000) GS:ffff88097f2c0000(0000) knlGS:0000000000000000
>>>>>   CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>>>>>   CR2: ffffbb00ff33b000 CR3: 00000009405a3000 CR4: 00000000000006e0
>>>>>   DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>>>   DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>>>>   Process Hibackp (pid: 16196, threadinfo ffff88094165e000, task ffff8808eb9ba600)
>>>>>   Stack:
>>>>>    ffffffff811b8aaa 0000000000004000 ffff880943fea480 ffff8808ef2bae50
>>>>>    ffff880943d32980 fffffffffffffffb ffff8808ef2bae40 ffff88094165ff50
>>>>>    0000000000004000 000000000056ebe0 ffffffff811ad847 000000000056ebe0
>>>>>   Call Trace:
>>>>>    [<ffffffff811b8aaa>] read_kcore+0x17a/0x370
>>>>>    [<ffffffff811ad847>] proc_reg_read+0x77/0xc0
>>>>>    [<ffffffff81151687>] vfs_read+0xc7/0x130
>>>>>    [<ffffffff811517f3>] sys_read+0x53/0xa0
>>>>>    [<ffffffff81449692>] system_call_fastpath+0x16/0x1b
>>>>>
>>>>> Investigation determined that the bug triggered when reading system RAM
>>>>> at the 4G mark. On this system, that was the first address using 1G pages
>>>> Do you mean there is one page which is 1G?
>>>>
>>> 1GB page support in the native kernel started in 2.6.27 with these two commits:
>> Why call the kernel native? Which kind of kernel is not native?
> Native as opposed to running under a VMM such as Xen.

Oh, I see. Thanks. :)

>
>>> 39c11e6 and b4718e6. On Intel CPUs, 1GB pages are supported from Westmere onward.
>>> BTW, the IBM System x3550 M3 is a Westmere-based system.
>> Is it only used for hugetlbfs pages?
> Yes, for now.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2013-03-01  9:40 UTC | newest]

Thread overview: 23+ messages
2013-02-11 14:52 [PATCH] x86: mm: Check if PUD is large when validating a kernel address Mel Gorman
2013-02-11 14:52 ` Mel Gorman
2013-02-11 19:41 ` Rik van Riel
2013-02-11 19:41   ` Rik van Riel
2013-02-12  6:40 ` Johannes Weiner
2013-02-12  6:40   ` Johannes Weiner
2013-02-12 17:43 ` Michal Hocko
2013-02-12 17:43   ` Michal Hocko
2013-02-13 11:02 ` [PATCH] x86: mm: Check if PUD is large when validating a kernel address v2 Mel Gorman
2013-02-13 11:02   ` Mel Gorman
2013-02-13 11:10   ` Ingo Molnar
2013-02-13 11:10     ` Ingo Molnar
2013-02-13 11:14     ` Mel Gorman
2013-02-13 11:14       ` Mel Gorman
2013-03-01  6:43   ` Simon Jeons
2013-03-01  6:43     ` Simon Jeons
2013-03-01  9:15     ` Chen Gong
2013-03-01  9:21       ` Simon Jeons
2013-03-01  9:21         ` Simon Jeons
2013-03-01  9:35         ` Chen Gong
2013-03-01  9:39           ` Simon Jeons
2013-03-01  9:39             ` Simon Jeons
2013-02-13 12:12 ` [tip:x86/urgent] x86/mm: Check if PUD is large when validating a kernel address tip-bot for Mel Gorman
