All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg KH <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: torvalds@linux-foundation.org, akpm@linux-foundation.org,
	alan@lxorguk.ukuu.org.uk, Mel Gorman <mgorman@suse.de>,
	Dave Jones <davej@redhat.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Stephen Wilson <wilsons@start.ca>,
	Christoph Lameter <cl@linux.com>
Subject: [ 25/55] mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages
Date: Sun, 27 May 2012 09:26:38 +0900	[thread overview]
Message-ID: <20120527002617.671149586@linuxfoundation.org> (raw)
In-Reply-To: <20120527005203.GA2146@kroah.com>

3.0-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mel Gorman <mgorman@suse.de>

commit 05f144a0d5c2207a0349348127f996e104ad7404 upstream.

Dave Jones' system call fuzz testing tool "trinity" triggered the
following bug error with slab debugging enabled

    =============================================================================
    BUG numa_policy (Not tainted): Poison overwritten
    -----------------------------------------------------------------------------

    INFO: 0xffff880146498250-0xffff880146498250. First byte 0x6a instead of 0x6b
    INFO: Allocated in mpol_new+0xa3/0x140 age=46310 cpu=6 pid=32154
     __slab_alloc+0x3d3/0x445
     kmem_cache_alloc+0x29d/0x2b0
     mpol_new+0xa3/0x140
     sys_mbind+0x142/0x620
     system_call_fastpath+0x16/0x1b
    INFO: Freed in __mpol_put+0x27/0x30 age=46268 cpu=6 pid=32154
     __slab_free+0x2e/0x1de
     kmem_cache_free+0x25a/0x260
     __mpol_put+0x27/0x30
     remove_vma+0x68/0x90
     exit_mmap+0x118/0x140
     mmput+0x73/0x110
     exit_mm+0x108/0x130
     do_exit+0x162/0xb90
     do_group_exit+0x4f/0xc0
     sys_exit_group+0x17/0x20
     system_call_fastpath+0x16/0x1b
    INFO: Slab 0xffffea0005192600 objects=27 used=27 fp=0x          (null) flags=0x20000000004080
    INFO: Object 0xffff880146498250 @offset=592 fp=0xffff88014649b9d0

This implied a reference counting bug and the problem happened during
mbind().

mbind() applies a new memory policy to a range and uses mbind_range() to
merge existing VMAs or split them as necessary.  In the event of splits,
mpol_dup() will allocate a new struct mempolicy and maintain existing
reference counts whose rules are documented in
Documentation/vm/numa_memory_policy.txt .

The problem occurs with shared memory policies.  The vm_op->set_policy
increments the reference count if necessary and split_vma() and
vma_merge() have already handled the existing reference counts.
However, policy_vma() screws it up by replacing an existing
vma->vm_policy with one that potentially has the wrong reference count
leading to a premature free.  This patch removes the damage caused by
policy_vma().

With this patch applied Dave's trinity tool runs an mbind test for 5
minutes without error.  /proc/slabinfo reported that there are no
numa_policy or shared_policy_node objects allocated after the test
completed and the shared memory region was deleted.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Dave Jones <davej@redhat.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Stephen Wilson <wilsons@start.ca>
Cc: Christoph Lameter <cl@linux.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 mm/mempolicy.c |   41 +++++++++++++++++------------------------
 1 file changed, 17 insertions(+), 24 deletions(-)

--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -606,27 +606,6 @@ check_range(struct mm_struct *mm, unsign
 	return first;
 }
 
-/* Apply policy to a single VMA */
-static int policy_vma(struct vm_area_struct *vma, struct mempolicy *new)
-{
-	int err = 0;
-	struct mempolicy *old = vma->vm_policy;
-
-	pr_debug("vma %lx-%lx/%lx vm_ops %p vm_file %p set_policy %p\n",
-		 vma->vm_start, vma->vm_end, vma->vm_pgoff,
-		 vma->vm_ops, vma->vm_file,
-		 vma->vm_ops ? vma->vm_ops->set_policy : NULL);
-
-	if (vma->vm_ops && vma->vm_ops->set_policy)
-		err = vma->vm_ops->set_policy(vma, new);
-	if (!err) {
-		mpol_get(new);
-		vma->vm_policy = new;
-		mpol_put(old);
-	}
-	return err;
-}
-
 /* Step 2: apply policy to a range and do splits. */
 static int mbind_range(struct mm_struct *mm, unsigned long start,
 		       unsigned long end, struct mempolicy *new_pol)
@@ -666,9 +645,23 @@ static int mbind_range(struct mm_struct
 			if (err)
 				goto out;
 		}
-		err = policy_vma(vma, new_pol);
-		if (err)
-			goto out;
+
+		/*
+		 * Apply policy to a single VMA. The reference counting of
+		 * policy for vma_policy linkages has already been handled by
+		 * vma_merge and split_vma as necessary. If this is a shared
+		 * policy then ->set_policy will increment the reference count
+		 * for an sp node.
+		 */
+		pr_debug("vma %lx-%lx/%lx vm_ops %p vm_file %p set_policy %p\n",
+			vma->vm_start, vma->vm_end, vma->vm_pgoff,
+			vma->vm_ops, vma->vm_file,
+			vma->vm_ops ? vma->vm_ops->set_policy : NULL);
+		if (vma->vm_ops && vma->vm_ops->set_policy) {
+			err = vma->vm_ops->set_policy(vma, new_pol);
+			if (err)
+				goto out;
+		}
 	}
 
  out:



  parent reply	other threads:[~2012-05-27  0:59 UTC|newest]

Thread overview: 56+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-05-27  0:52 [ 00/55] 3.0.33-stable review Greg KH
2012-05-27  0:26 ` [ 01/55] tilegx: enable SYSCALL_WRAPPERS support Greg KH
2012-05-27  0:26 ` [ 02/55] block: fix buffer overflow when printing partition UUIDs Greg KH
2012-05-27  0:26 ` [ 03/55] block: dont mark buffers beyond end of disk as mapped Greg KH
2012-05-27  0:26 ` [ 04/55] PARISC: fix PA1.1 oops on boot Greg KH
2012-05-27  0:26 ` [ 05/55] PARISC: fix crash in flush_icache_page_asm on PA1.1 Greg KH
2012-05-27  0:26 ` [ 06/55] PARISC: fix panic on prefetch(NULL) on PA7300LC Greg KH
2012-05-27  0:26 ` [ 07/55] isdn/gigaset: ratelimit CAPI message dumps Greg KH
2012-05-27  0:26 ` [ 08/55] vfs: make AIO use the proper rw_verify_area() area helpers Greg KH
2012-05-27  0:26 ` [ 09/55] cfg80211: warn if db.txt is empty with CONFIG_CFG80211_INTERNAL_REGDB Greg KH
2012-05-27  0:26 ` [ 10/55] Fix blocking allocations called very early during bootup Greg KH
2012-05-27  0:26 ` [ 11/55] s390/pfault: fix task state race Greg KH
2012-05-27  0:26 ` [ 12/55] SCSI: mpt2sas: Fix for panic happening because of improper memory allocation Greg KH
2012-05-27  0:26 ` [ 13/55] RDMA/cxgb4: Drop peer_abort when no endpoint found Greg KH
2012-05-27  0:26 ` [ 14/55] KEYS: Use the compat keyctl() syscall wrapper on Sparc64 for Sparc32 compat Greg KH
2012-05-27  0:26 ` [ 15/55] SELinux: if sel_make_bools errors dont leave inconsistent state Greg KH
2012-05-27  0:26 ` [ 16/55] drivers/staging/comedi/comedi_fops.c: add missing vfree Greg KH
2012-05-27  0:26 ` [ 17/55] perf/x86: Update event scheduling constraints for AMD family 15h models Greg KH
2012-05-27  0:26 ` [ 18/55] mtd: sm_ftl: fix typo in major number Greg KH
2012-05-27  0:26 ` [ 19/55] ahci: Detect Marvell 88SE9172 SATA controller Greg KH
2012-05-27  0:26 ` [ 20/55] um: Fix __swp_type() Greg KH
2012-05-27  0:26 ` [ 21/55] um: Implement a custom pte_same() function Greg KH
2012-05-27  0:26 ` [ 22/55] docs: update HOWTO for 2.6.x -> 3.x versioning Greg KH
2012-05-27  0:26 ` [ 23/55] USB: cdc-wdm: poll must return POLLHUP if device is gone Greg KH
2012-05-27  0:26 ` [ 24/55] workqueue: skip nr_running sanity check in worker_enter_idle() if trustee is active Greg KH
2012-05-27  0:26 ` Greg KH [this message]
2012-05-27  0:26 ` [ 26/55] md: using GFP_NOIO to allocate bio for flush request Greg KH
2012-05-27  0:26 ` [ 27/55] Add missing call to uart_update_timeout() Greg KH
2012-05-27  0:26 ` [ 28/55] tty: Allow uart_register/unregister/register Greg KH
2012-05-27  0:26 ` [ 29/55] USB: ftdi-sio: add support for Physik Instrumente E-861 Greg KH
2012-05-27  0:26 ` [ 30/55] usb-storage: unusual_devs entry for Yarvik PMP400 MP4 player Greg KH
2012-05-27  0:26 ` [ 31/55] USB: ffs-test: fix length argument of out function call Greg KH
2012-05-27  0:26 ` [ 32/55] drivers/rtc/rtc-pl031.c: configure correct wday for 2000-01-01 Greg KH
2012-05-27  0:26 ` [ 33/55] SCSI: hpsa: Fix problem with MSA2xxx devices Greg KH
2012-05-27  0:26 ` [ 34/55] usb: usbtest: two super speed fixes for usbtest Greg KH
2012-05-27  0:26 ` [ 35/55] USB: Remove races in devio.c Greg KH
2012-05-27  0:26 ` [ 36/55] USB: serial: ti_usb_3410_5052: Add support for the FRI2 serial console Greg KH
2012-05-27  0:26 ` [ 37/55] usb: gadget: fsl_udc_core: dTDs next dtd pointer need to be updated once written Greg KH
2012-05-27  0:26 ` [ 38/55] usb: add USB_QUIRK_RESET_RESUME for M-Audio 88es Greg KH
2012-05-27  0:26 ` [ 39/55] xhci: Add Lynx Point to list of Intel switchable hosts Greg KH
2012-05-27  0:26 ` [ 40/55] usb-xhci: Handle COMP_TX_ERR for isoc tds Greg KH
2012-05-27  0:26 ` [ 41/55] xhci: Reset reserved command ring TRBs on cleanup Greg KH
2012-05-27  0:26 ` [ 42/55] xhci: Add new short TX quirk for Fresco Logic host Greg KH
2012-05-27  0:26 ` [ 43/55] drm/i915: Avoid a double-read of PCH_IIR during interrupt handling Greg KH
2012-05-27  0:26 ` [ 44/55] drm/i915: [GEN7] Use HW scheduler for fixed function shaders Greg KH
2012-05-27  0:26 ` [ 45/55] drm/i915: dont clobber the pipe param in sanitize_modesetting Greg KH
2012-05-27  0:26 ` [ 46/55] nouveau: nouveau_set_bo_placement takes TTM flags Greg KH
2012-05-27  0:27 ` [ 47/55] [media] smsusb: add autodetection support for USB ID 2040:c0a0 Greg KH
2012-05-27  0:27 ` [ 48/55] media: uvcvideo: Fix ENUMINPUT handling Greg KH
2012-05-27  0:27 ` [ 49/55] x86/mce: Fix check for processor context when machine check was taken Greg KH
2012-05-27  0:27 ` [ 50/55] mmc: sdio: avoid spurious calls to interrupt handlers Greg KH
2012-05-27  0:27 ` [ 51/55] tile: fix bug where fls(0) was not returning 0 Greg KH
2012-05-27  0:27 ` [ 52/55] isci: fix oem parameter validation on single controller skus Greg KH
2012-05-27  0:27 ` [ 53/55] ARM: 7365/1: drop unused parameter from flush_cache_user_range Greg KH
2012-05-27  0:27 ` [ 54/55] ARM: 7409/1: Do not call flush_cache_user_range with mmap_sem held Greg KH
2012-05-27  0:27 ` [ 55/55] i2c: davinci: Free requested IRQ in remove Greg KH

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120527002617.671149586@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=cl@linux.com \
    --cc=davej@redhat.com \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=wilsons@start.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.