All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Gu Zheng <guz.fnst@cn.fujitsu.com>,
	Taku Izumi <izumi.taku@jp.fujitsu.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Tang Chen <tangchen@cn.fujitsu.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 3.10 26/29] mm/memory_hotplug.c: set zone->wait_table to null after freeing it
Date: Fri, 19 Jun 2015 13:36:46 -0700	[thread overview]
Message-ID: <20150619203558.273449253@linuxfoundation.org> (raw)
In-Reply-To: <20150619203557.356558223@linuxfoundation.org>

3.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Gu Zheng <guz.fnst@cn.fujitsu.com>

commit 85bd839983778fcd0c1c043327b14a046e979b39 upstream.

Izumi found the following oops when hot re-adding a node:

    BUG: unable to handle kernel paging request at ffffc90008963690
    IP: __wake_up_bit+0x20/0x70
    Oops: 0000 [#1] SMP
    CPU: 68 PID: 1237 Comm: rs:main Q:Reg Not tainted 4.1.0-rc5 #80
    Hardware name: FUJITSU PRIMEQUEST2800E/SB, BIOS PRIMEQUEST 2000 Series BIOS Version 1.87 04/28/2015
    task: ffff880838df8000 ti: ffff880017b94000 task.ti: ffff880017b94000
    RIP: 0010:[<ffffffff810dff80>]  [<ffffffff810dff80>] __wake_up_bit+0x20/0x70
    RSP: 0018:ffff880017b97be8  EFLAGS: 00010246
    RAX: ffffc90008963690 RBX: 00000000003c0000 RCX: 000000000000a4c9
    RDX: 0000000000000000 RSI: ffffea101bffd500 RDI: ffffc90008963648
    RBP: ffff880017b97c08 R08: 0000000002000020 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff8a0797c73800
    R13: ffffea101bffd500 R14: 0000000000000001 R15: 00000000003c0000
    FS:  00007fcc7ffff700(0000) GS:ffff880874800000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffffc90008963690 CR3: 0000000836761000 CR4: 00000000001407e0
    Call Trace:
      unlock_page+0x6d/0x70
      generic_write_end+0x53/0xb0
      xfs_vm_write_end+0x29/0x80 [xfs]
      generic_perform_write+0x10a/0x1e0
      xfs_file_buffered_aio_write+0x14d/0x3e0 [xfs]
      xfs_file_write_iter+0x79/0x120 [xfs]
      __vfs_write+0xd4/0x110
      vfs_write+0xac/0x1c0
      SyS_write+0x58/0xd0
      system_call_fastpath+0x12/0x76
    Code: 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 45 f8 31 c0 48 8d 47 48 <48> 39 47 48 48 c7 45 e8 00 00 00 00 48 c7 45 f0 00 00 00 00 48
    RIP  [<ffffffff810dff80>] __wake_up_bit+0x20/0x70
     RSP <ffff880017b97be8>
    CR2: ffffc90008963690

Reproduce method (re-add a node)::
  Hot-add nodeA --> remove nodeA --> hot-add nodeA (panic)

This seems an use-after-free problem, and the root cause is
zone->wait_table was not set to *NULL* after free it in
try_offline_node.

When hot re-add a node, we will reuse the pgdat of it, so does the zone
struct, and when add pages to the target zone, it will init the zone
first (including the wait_table) if the zone is not initialized.  The
judgement of zone initialized is based on zone->wait_table:

	static inline bool zone_is_initialized(struct zone *zone)
	{
		return !!zone->wait_table;
	}

so if we do not set the zone->wait_table to *NULL* after free it, the
memory hotplug routine will skip the init of new zone when hot re-add
the node, and the wait_table still points to the freed memory, then we
will access the invalid address when trying to wake up the waiting
people after the i/o operation with the page is done, such as mentioned
above.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reported-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Reviewed by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 mm/memory_hotplug.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1803,8 +1803,10 @@ void try_offline_node(int nid)
 		 * wait_table may be allocated from boot memory,
 		 * here only free if it's allocated by vmalloc.
 		 */
-		if (is_vmalloc_addr(zone->wait_table))
+		if (is_vmalloc_addr(zone->wait_table)) {
 			vfree(zone->wait_table);
+			zone->wait_table = NULL;
+		}
 	}
 }
 EXPORT_SYMBOL(try_offline_node);


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
Please read the FAQ at  http://www.tux.org/lkml/

WARNING: multiple messages have this Message-ID (diff)
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Gu Zheng <guz.fnst@cn.fujitsu.com>,
	Taku Izumi <izumi.taku@jp.fujitsu.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Tang Chen <tangchen@cn.fujitsu.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 3.10 26/29] mm/memory_hotplug.c: set zone->wait_table to null after freeing it
Date: Fri, 19 Jun 2015 13:36:46 -0700	[thread overview]
Message-ID: <20150619203558.273449253@linuxfoundation.org> (raw)
In-Reply-To: <20150619203557.356558223@linuxfoundation.org>

3.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Gu Zheng <guz.fnst@cn.fujitsu.com>

commit 85bd839983778fcd0c1c043327b14a046e979b39 upstream.

Izumi found the following oops when hot re-adding a node:

    BUG: unable to handle kernel paging request at ffffc90008963690
    IP: __wake_up_bit+0x20/0x70
    Oops: 0000 [#1] SMP
    CPU: 68 PID: 1237 Comm: rs:main Q:Reg Not tainted 4.1.0-rc5 #80
    Hardware name: FUJITSU PRIMEQUEST2800E/SB, BIOS PRIMEQUEST 2000 Series BIOS Version 1.87 04/28/2015
    task: ffff880838df8000 ti: ffff880017b94000 task.ti: ffff880017b94000
    RIP: 0010:[<ffffffff810dff80>]  [<ffffffff810dff80>] __wake_up_bit+0x20/0x70
    RSP: 0018:ffff880017b97be8  EFLAGS: 00010246
    RAX: ffffc90008963690 RBX: 00000000003c0000 RCX: 000000000000a4c9
    RDX: 0000000000000000 RSI: ffffea101bffd500 RDI: ffffc90008963648
    RBP: ffff880017b97c08 R08: 0000000002000020 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff8a0797c73800
    R13: ffffea101bffd500 R14: 0000000000000001 R15: 00000000003c0000
    FS:  00007fcc7ffff700(0000) GS:ffff880874800000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffffc90008963690 CR3: 0000000836761000 CR4: 00000000001407e0
    Call Trace:
      unlock_page+0x6d/0x70
      generic_write_end+0x53/0xb0
      xfs_vm_write_end+0x29/0x80 [xfs]
      generic_perform_write+0x10a/0x1e0
      xfs_file_buffered_aio_write+0x14d/0x3e0 [xfs]
      xfs_file_write_iter+0x79/0x120 [xfs]
      __vfs_write+0xd4/0x110
      vfs_write+0xac/0x1c0
      SyS_write+0x58/0xd0
      system_call_fastpath+0x12/0x76
    Code: 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 45 f8 31 c0 48 8d 47 48 <48> 39 47 48 48 c7 45 e8 00 00 00 00 48 c7 45 f0 00 00 00 00 48
    RIP  [<ffffffff810dff80>] __wake_up_bit+0x20/0x70
     RSP <ffff880017b97be8>
    CR2: ffffc90008963690

Reproduce method (re-add a node)::
  Hot-add nodeA --> remove nodeA --> hot-add nodeA (panic)

This seems an use-after-free problem, and the root cause is
zone->wait_table was not set to *NULL* after free it in
try_offline_node.

When hot re-add a node, we will reuse the pgdat of it, so does the zone
struct, and when add pages to the target zone, it will init the zone
first (including the wait_table) if the zone is not initialized.  The
judgement of zone initialized is based on zone->wait_table:

	static inline bool zone_is_initialized(struct zone *zone)
	{
		return !!zone->wait_table;
	}

so if we do not set the zone->wait_table to *NULL* after free it, the
memory hotplug routine will skip the init of new zone when hot re-add
the node, and the wait_table still points to the freed memory, then we
will access the invalid address when trying to wake up the waiting
people after the i/o operation with the page is done, such as mentioned
above.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reported-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Reviewed by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 mm/memory_hotplug.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1803,8 +1803,10 @@ void try_offline_node(int nid)
 		 * wait_table may be allocated from boot memory,
 		 * here only free if it's allocated by vmalloc.
 		 */
-		if (is_vmalloc_addr(zone->wait_table))
+		if (is_vmalloc_addr(zone->wait_table)) {
 			vfree(zone->wait_table);
+			zone->wait_table = NULL;
+		}
 	}
 }
 EXPORT_SYMBOL(try_offline_node);


--
To unsubscribe from this list: send the line "unsubscribe stable" in

  parent reply	other threads:[~2015-06-19 20:51 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-19 20:36 [PATCH 3.10 00/29] 3.10.81-stable review Greg Kroah-Hartman
2015-06-19 20:36 ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 01/29] net: phy: Allow EEE for all RGMII variants Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 02/29] ipv4: Avoid crashing in ip_error Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 03/29] bridge: fix parsing of MLDv2 reports Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 04/29] net: dp83640: fix broken calibration routine Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 05/29] unix/caif: sk_socket can disappear when state is unlocked Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 06/29] net_sched: invoke ->attach() after setting dev->qdisc Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 07/29] udp: fix behavior of wrong checksums Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 08/29] xen: netback: read hotplug script once at start of day Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 09/29] iio: adis16400: Report pressure channel scale Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 10/29] iio: adis16400: Use != channel indices for the two voltage channels Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 11/29] iio: adis16400: Compute the scan mask from channel indices Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 12/29] ALSA: hda/realtek - Add a fixup for another Acer Aspire 9420 Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 13/29] ALSA: usb-audio: Add mic volume fix quirk for Logitech Quickcam Fusion Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 14/29] ALSA: usb-audio: add MAYA44 USB+ mixer control names Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 16/29] block: fix ext_dev_lock lockdep report Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 17/29] USB: cp210x: add ID for HubZ dual ZigBee and Z-Wave dongle Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 18/29] USB: serial: ftdi_sio: Add support for a Motion Tracker Development Board Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 19/29] ring-buffer-benchmark: Fix the wrong sched_priority of producer Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 20/29] MIPS: Fix enabling of DEBUG_STACKOVERFLOW Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 21/29] ozwpan: Use proper check to prevent heap overflow Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 22/29] ozwpan: divide-by-zero leading to panic Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 23/29] ozwpan: unchecked signed subtraction leads to DoS Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 24/29] pata_octeon_cf: fix broken build Greg Kroah-Hartman
2015-06-19 20:36 ` Greg Kroah-Hartman [this message]
2015-06-19 20:36   ` [PATCH 3.10 26/29] mm/memory_hotplug.c: set zone->wait_table to null after freeing it Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 27/29] cfg80211: wext: clear sinfo struct before calling driver Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 28/29] btrfs: incorrect handling for fiemap_fill_next_extent return Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-19 20:36 ` [PATCH 3.10 29/29] btrfs: cleanup orphans while looking up default subvolume Greg Kroah-Hartman
2015-06-19 20:36   ` Greg Kroah-Hartman
2015-06-20  1:13 ` [PATCH 3.10 00/29] 3.10.81-stable review Shuah Khan
2015-06-20  1:13   ` Shuah Khan
2015-06-20  1:25 ` Guenter Roeck
2015-06-20  1:25   ` Guenter Roeck
2015-06-20  7:50 ` Sudip Mukherjee
2015-06-20  7:50   ` Sudip Mukherjee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150619203558.273449253@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=guz.fnst@cn.fujitsu.com \
    --cc=izumi.taku@jp.fujitsu.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=tangchen@cn.fujitsu.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.