public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Cliff Wickman <cpw@sgi.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Mel Gorman <mel@csn.ul.ie>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Dave Hansen <dave.hansen@intel.com>,
	David Sterba <dsterba@suse.cz>,
	Johannes Weiner <hannes@cmpxchg.org>,
	KOSAKI Motohiro <kosaki.motohiro@gmail.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [ 26/44] mm/pagewalk.c: walk_page_range should avoid VM_PFNMAP areas
Date: Wed,  5 Jun 2013 14:12:24 -0700	[thread overview]
Message-ID: <20130605211224.615407261@linuxfoundation.org> (raw)
In-Reply-To: <20130605211221.858177087@linuxfoundation.org>

3.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Cliff Wickman <cpw@sgi.com>

commit a9ff785e4437c83d2179161e012f5bdfbd6381f0 upstream.

A panic can be caused by simply cat'ing /proc/<pid>/smaps while an
application has a VM_PFNMAP range.  It happened in-house when a
benchmarker was trying to decipher the memory layout of his program.

/proc/<pid>/smaps and similar walks through a user page table should not
be looking at VM_PFNMAP areas.

Certain tests in walk_page_range() (specifically split_huge_page_pmd())
assume that all the mapped PFN's are backed with page structures.  And
this is not usually true for VM_PFNMAP areas.  This can result in panics
on kernel page faults when attempting to address those page structures.

There are a half dozen callers of walk_page_range() that walk through a
task's entire page table (as N.  Horiguchi pointed out).  So rather than
change all of them, this patch changes just walk_page_range() to ignore
VM_PFNMAP areas.

The logic of hugetlb_vma() is moved back into walk_page_range(), as we
want to test any vma in the range.

VM_PFNMAP areas are used by:
- graphics memory manager   gpu/drm/drm_gem.c
- global reference unit     sgi-gru/grufile.c
- sgi special memory        char/mspec.c
- and probably several out-of-tree modules

[akpm@linux-foundation.org: remove now-unused hugetlb_vma() stub]
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Sterba <dsterba@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 mm/pagewalk.c |   70 +++++++++++++++++++++++++++++-----------------------------
 1 file changed, 36 insertions(+), 34 deletions(-)

--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -127,28 +127,7 @@ static int walk_hugetlb_range(struct vm_
 	return 0;
 }
 
-static struct vm_area_struct* hugetlb_vma(unsigned long addr, struct mm_walk *walk)
-{
-	struct vm_area_struct *vma;
-
-	/* We don't need vma lookup at all. */
-	if (!walk->hugetlb_entry)
-		return NULL;
-
-	VM_BUG_ON(!rwsem_is_locked(&walk->mm->mmap_sem));
-	vma = find_vma(walk->mm, addr);
-	if (vma && vma->vm_start <= addr && is_vm_hugetlb_page(vma))
-		return vma;
-
-	return NULL;
-}
-
 #else /* CONFIG_HUGETLB_PAGE */
-static struct vm_area_struct* hugetlb_vma(unsigned long addr, struct mm_walk *walk)
-{
-	return NULL;
-}
-
 static int walk_hugetlb_range(struct vm_area_struct *vma,
 			      unsigned long addr, unsigned long end,
 			      struct mm_walk *walk)
@@ -199,30 +178,53 @@ int walk_page_range(unsigned long addr,
 	if (!walk->mm)
 		return -EINVAL;
 
+	VM_BUG_ON(!rwsem_is_locked(&walk->mm->mmap_sem));
+
 	pgd = pgd_offset(walk->mm, addr);
 	do {
-		struct vm_area_struct *vma;
+		struct vm_area_struct *vma = NULL;
 
 		next = pgd_addr_end(addr, end);
 
 		/*
-		 * handle hugetlb vma individually because pagetable walk for
-		 * the hugetlb page is dependent on the architecture and
-		 * we can't handled it in the same manner as non-huge pages.
+		 * This function was not intended to be vma based.
+		 * But there are vma special cases to be handled:
+		 * - hugetlb vma's
+		 * - VM_PFNMAP vma's
 		 */
-		vma = hugetlb_vma(addr, walk);
+		vma = find_vma(walk->mm, addr);
 		if (vma) {
-			if (vma->vm_end < next)
+			/*
+			 * There are no page structures backing a VM_PFNMAP
+			 * range, so do not allow split_huge_page_pmd().
+			 */
+			if ((vma->vm_start <= addr) &&
+			    (vma->vm_flags & VM_PFNMAP)) {
 				next = vma->vm_end;
+				pgd = pgd_offset(walk->mm, next);
+				continue;
+			}
 			/*
-			 * Hugepage is very tightly coupled with vma, so
-			 * walk through hugetlb entries within a given vma.
+			 * Handle hugetlb vma individually because pagetable
+			 * walk for the hugetlb page is dependent on the
+			 * architecture and we can't handled it in the same
+			 * manner as non-huge pages.
 			 */
-			err = walk_hugetlb_range(vma, addr, next, walk);
-			if (err)
-				break;
-			pgd = pgd_offset(walk->mm, next);
-			continue;
+			if (walk->hugetlb_entry && (vma->vm_start <= addr) &&
+			    is_vm_hugetlb_page(vma)) {
+				if (vma->vm_end < next)
+					next = vma->vm_end;
+				/*
+				 * Hugepage is very tightly coupled with vma,
+				 * so walk through hugetlb entries within a
+				 * given vma.
+				 */
+				err = walk_hugetlb_range(vma, addr, next, walk);
+				if (err)
+					break;
+				pgd = pgd_offset(walk->mm, next);
+				continue;
+			}
 		}
 
 		if (pgd_none_or_clear_bad(pgd)) {



  parent reply	other threads:[~2013-06-05 21:12 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-06-05 21:11 [ 00/44] 3.4.48-stable review Greg Kroah-Hartman
2013-06-05 21:11 ` [ 01/44] avr32: fix relocation check for signed 18-bit offset Greg Kroah-Hartman
2013-06-05 21:12 ` [ 02/44] ARM: plat-orion: Fix num_resources and id for ge10 and ge11 Greg Kroah-Hartman
2013-06-05 21:12 ` [ 03/44] staging: vt6656: use free_netdev instead of kfree Greg Kroah-Hartman
2013-06-05 21:12 ` [ 04/44] usb: option: Add Telewell TW-LTE 4G Greg Kroah-Hartman
2013-06-05 21:12 ` [ 05/44] USB: option: add device IDs for Dell 5804 (Novatel E371) WWAN card Greg Kroah-Hartman
2013-06-05 21:12 ` [ 06/44] USB: ftdi_sio: Add support for Newport CONEX motor drivers Greg Kroah-Hartman
2013-06-05 21:12 ` [ 07/44] USB: cxacru: potential underflow in cxacru_cm_get_array() Greg Kroah-Hartman
2013-06-05 21:12 ` [ 08/44] TTY: Fix tty miss restart after we turn off flow-control Greg Kroah-Hartman
2013-06-05 21:12 ` [ 09/44] USB: Blacklisted Cinterions PLxx WWAN Interface Greg Kroah-Hartman
2013-06-05 21:12 ` [ 10/44] USB: reset resume quirk needed by a hub Greg Kroah-Hartman
2013-06-05 21:12 ` [ 11/44] USB: xHCI: override bogus bulk wMaxPacketSize values Greg Kroah-Hartman
2013-06-05 21:12 ` [ 12/44] USB: UHCI: fix for suspend of virtual HP controller Greg Kroah-Hartman
2013-06-05 21:12 ` [ 13/44] cifs: only set ops for inodes in I_NEW state Greg Kroah-Hartman
2013-06-05 21:12 ` [ 14/44] fat: fix possible overflow for fat_clusters Greg Kroah-Hartman
2013-06-05 21:12 ` [ 15/44] perf: net_dropmonitor: Fix trace parameter order Greg Kroah-Hartman
2013-06-05 21:12 ` [ 16/44] perf: net_dropmonitor: Fix symbol-relative addresses Greg Kroah-Hartman
2013-06-05 21:12 ` [ 17/44] ocfs2: goto out_unlock if ocfs2_get_clusters_nocache() failed in ocfs2_fiemap() Greg Kroah-Hartman
2013-06-05 21:12 ` [ 18/44] Kirkwood: Enable PCIe port 1 on QNAP TS-11x/TS-21x Greg Kroah-Hartman
2013-06-05 21:12 ` [ 19/44] drivers/leds/leds-ot200.c: fix error caused by shifted mask Greg Kroah-Hartman
2013-06-05 21:12 ` [ 20/44] mm compaction: fix of improper cache flush in migration code Greg Kroah-Hartman
2013-06-05 21:12 ` [ 21/44] klist: del waiter from klist_remove_waiters before wakeup waitting process Greg Kroah-Hartman
2013-06-05 21:12 ` [ 22/44] wait: fix false timeouts when using wait_event_timeout() Greg Kroah-Hartman
2013-06-05 21:12 ` [ 23/44] nilfs2: fix issue of nilfs_set_page_dirty() for page at EOF boundary Greg Kroah-Hartman
2013-06-05 21:12 ` [ 24/44] mm: mmu_notifier: re-fix freed page still mapped in secondary MMU Greg Kroah-Hartman
2013-06-05 21:12 ` [ 25/44] drivers/block/brd.c: fix brd_lookup_page() race Greg Kroah-Hartman
2013-06-05 21:12 ` Greg Kroah-Hartman [this message]
2013-06-05 21:12 ` [ 27/44] mm/THP: use pmd_populate() to update the pmd with pgtable_t pointer Greg Kroah-Hartman
2013-06-05 21:12 ` [ 28/44] iscsi-target: fix heap buffer overflow on error Greg Kroah-Hartman
2013-06-05 21:12 ` [ 29/44] NFSv4: Fix a thinko in nfs4_try_open_cached Greg Kroah-Hartman
2013-06-05 21:12 ` [ 30/44] xfs: kill suid/sgid through the truncate path Greg Kroah-Hartman
2013-06-05 21:12 ` [ 31/44] drm/radeon: fix card_posted check for newer asics Greg Kroah-Hartman
2013-06-05 21:12 ` [ 32/44] cifs: fix potential buffer overrun when composing a new options string Greg Kroah-Hartman
2013-06-05 21:12 ` [ 33/44] USB: io_ti: Fix NULL dereference in chase_port() Greg Kroah-Hartman
2013-06-05 21:12 ` [ 34/44] ata_piix: add PCI IDs for Intel BayTail Greg Kroah-Hartman
2013-06-05 21:12 ` [ 35/44] libata: make ata_exec_internal_sg honor DMADIR Greg Kroah-Hartman
2013-06-05 21:12 ` [ 36/44] m68k/mac: Fix unexpected interrupt with CONFIG_EARLY_PRINTK Greg Kroah-Hartman
2013-06-05 21:12 ` [ 37/44] xen/events: Handle VIRQ_TIMER before any other hardirq in event loop Greg Kroah-Hartman
2013-06-05 21:12 ` [ 38/44] jfs: fix a couple races Greg Kroah-Hartman
2013-06-05 21:12 ` [ 39/44] xen-netback: remove skb in xen_netbk_alloc_page Greg Kroah-Hartman
2013-06-05 21:12 ` [ 40/44] iommu/amd: Re-enable IOMMU event log interrupt after handling Greg Kroah-Hartman
2013-06-05 21:12 ` [ 41/44] iommu/amd: Workaround for ERBT1312 Greg Kroah-Hartman
2013-06-05 21:12 ` [ 42/44] x86, um: Correct syscall table type attributes breaking gcc 4.8 Greg Kroah-Hartman
2013-06-05 21:12 ` [ 43/44] mac80211: close AP_VLAN interfaces before unregistering all Greg Kroah-Hartman
2013-06-05 21:12 ` [ 44/44] thinkpad-acpi: recognize latest V-Series using DMI_BIOS_VENDOR Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130605211224.615407261@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=cpw@sgi.com \
    --cc=dave.hansen@intel.com \
    --cc=dsterba@suse.cz \
    --cc=hannes@cmpxchg.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kosaki.motohiro@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mel@csn.ul.ie \
    --cc=n-horiguchi@ah.jp.nec.com \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox