From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Toshi Kani <toshi.kani@hpe.com>,
Zhang Zhen <zhenzhang.zhang@huawei.com>,
Reza Arbab <arbab@linux.vnet.ibm.com>,
David Rientjes <rientjes@google.com>,
Dan Williams <dan.j.williams@intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 4.9 28/66] base/memory, hotplug: fix a kernel oops in show_valid_zones()
Date: Tue, 7 Feb 2017 13:59:02 +0100
Message-ID: <20170207124529.518813323@linuxfoundation.org>
In-Reply-To: <20170207124528.281881183@linuxfoundation.org>
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Toshi Kani <toshi.kani@hpe.com>
commit a96dfddbcc04336bbed50dc2b24823e45e09e80c upstream.
Reading a sysfs "memoryN/valid_zones" file leads to the following oops
when the first page of a range is not backed by struct page.
show_valid_zones() assumes that 'start_pfn' is always valid for
page_zone().
BUG: unable to handle kernel paging request at ffffea017a000000
IP: show_valid_zones+0x6f/0x160
This issue may happen on x86-64 systems with 64GiB or more memory since
their memory block size is bumped up to 2GiB. [1] An example of such
a system is described below. 0x3240000000 is aligned only to 1GiB, so
the memory block containing it starts at 0x3200000000, which is not
backed by struct page.
BIOS-e820: [mem 0x0000003240000000-0x000000603fffffff] usable
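The alignment arithmetic behind this example can be checked with a small
user-space sketch. It is illustrative only and not part of the patch:
block_start() and the GiB macro are hypothetical names introduced here,
and the 2GiB block size is the value stated above.

```c
#include <stdint.h>

#define GiB (1ULL << 30)

/*
 * Hypothetical helper (not a kernel function): round 'addr' down to
 * the start of the block that contains it. 'block_size' must be a
 * power of two.
 */
static uint64_t block_start(uint64_t addr, uint64_t block_size)
{
	return addr & ~(block_size - 1);
}

/* Hypothetical helper: bytes between the block start and 'addr'. */
static uint64_t hole_before(uint64_t addr, uint64_t block_size)
{
	return addr - block_start(addr, block_size);
}
```

With a 2GiB block size, block_start(0x3240000000, 2 * GiB) evaluates to
0x3200000000 and hole_before() to 1GiB, so the first 1GiB of that memory
block lies outside the usable e820 range.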
Since test_pages_in_a_zone() already checks holes, fix this issue by
extending this function to return 'valid_start' and 'valid_end' for a
given range. show_valid_zones() then proceeds with the valid range.
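The idea can be modelled in user space as follows. This is a simplified
sketch, not the kernel code: struct model_page and
model_pages_in_a_zone() are hypothetical names, the 'present' flag
stands in for the pfn validity check, and the scan is per page rather
than per MAX_ORDER_NR_PAGES chunk as in the real function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; 'present' models a valid pfn. */
struct model_page {
	bool present;
	int zone_id;
};

/*
 * Return 1 if every present page in [start, end) belongs to one zone,
 * and report the range spanned by present pages via valid_start and
 * valid_end. Return 0 on mixed zones or if no page is present; the
 * output parameters are left untouched in that case.
 */
static int model_pages_in_a_zone(const struct model_page *pages,
				 size_t start, size_t end,
				 size_t *valid_start, size_t *valid_end)
{
	bool found = false;
	int zone = 0;
	size_t first = 0, last = 0;

	for (size_t i = start; i < end; i++) {
		if (!pages[i].present)
			continue;		/* hole: skip it */
		if (found && pages[i].zone_id != zone)
			return 0;		/* more than one zone */
		if (!found) {
			first = i;		/* first valid page */
			zone = pages[i].zone_id;
			found = true;
		}
		last = i + 1;			/* exclusive end so far */
	}

	if (!found)
		return 0;

	*valid_start = first;
	*valid_end = last;
	return 1;
}
```

A caller would then look up the zone from valid_start instead of the
start of the range, which is the change this patch makes in
show_valid_zones().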
[1] 'Commit bdee237c0343 ("x86: mm: Use 2GB memory block size on
large-memory x86-64 systems")'
Link: http://lkml.kernel.org/r/20170127222149.30893-3-toshi.kani@hpe.com
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Zhang Zhen <zhenzhang.zhang@huawei.com>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
drivers/base/memory.c | 12 ++++++------
include/linux/memory_hotplug.h | 3 ++-
mm/memory_hotplug.c | 20 +++++++++++++++-----
3 files changed, 23 insertions(+), 12 deletions(-)
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -391,33 +391,33 @@ static ssize_t show_valid_zones(struct d
{
struct memory_block *mem = to_memory_block(dev);
unsigned long start_pfn, end_pfn;
+ unsigned long valid_start, valid_end, valid_pages;
unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
- struct page *first_page;
struct zone *zone;
int zone_shift = 0;
start_pfn = section_nr_to_pfn(mem->start_section_nr);
end_pfn = start_pfn + nr_pages;
- first_page = pfn_to_page(start_pfn);
/* The block contains more than one zone can not be offlined. */
- if (!test_pages_in_a_zone(start_pfn, end_pfn))
+ if (!test_pages_in_a_zone(start_pfn, end_pfn, &valid_start, &valid_end))
return sprintf(buf, "none\n");
- zone = page_zone(first_page);
+ zone = page_zone(pfn_to_page(valid_start));
+ valid_pages = valid_end - valid_start;
/* MMOP_ONLINE_KEEP */
sprintf(buf, "%s", zone->name);
/* MMOP_ONLINE_KERNEL */
- zone_can_shift(start_pfn, nr_pages, ZONE_NORMAL, &zone_shift);
+ zone_can_shift(valid_start, valid_pages, ZONE_NORMAL, &zone_shift);
if (zone_shift) {
strcat(buf, " ");
strcat(buf, (zone + zone_shift)->name);
}
/* MMOP_ONLINE_MOVABLE */
- zone_can_shift(start_pfn, nr_pages, ZONE_MOVABLE, &zone_shift);
+ zone_can_shift(valid_start, valid_pages, ZONE_MOVABLE, &zone_shift);
if (zone_shift) {
strcat(buf, " ");
strcat(buf, (zone + zone_shift)->name);
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -85,7 +85,8 @@ extern int zone_grow_waitqueues(struct z
extern int add_one_highpage(struct page *page, int pfn, int bad_ppro);
/* VM interface that may be used by firmware interface */
extern int online_pages(unsigned long, unsigned long, int);
-extern int test_pages_in_a_zone(unsigned long, unsigned long);
+extern int test_pages_in_a_zone(unsigned long start_pfn, unsigned long end_pfn,
+ unsigned long *valid_start, unsigned long *valid_end);
extern void __offline_isolated_pages(unsigned long, unsigned long);
typedef void (*online_page_callback_t)(struct page *page);
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1484,10 +1484,13 @@ bool is_mem_section_removable(unsigned l
/*
* Confirm all pages in a range [start, end) belong to the same zone.
+ * When true, return its valid [start, end).
*/
-int test_pages_in_a_zone(unsigned long start_pfn, unsigned long end_pfn)
+int test_pages_in_a_zone(unsigned long start_pfn, unsigned long end_pfn,
+ unsigned long *valid_start, unsigned long *valid_end)
{
unsigned long pfn, sec_end_pfn;
+ unsigned long start, end;
struct zone *zone = NULL;
struct page *page;
int i;
@@ -1509,14 +1512,20 @@ int test_pages_in_a_zone(unsigned long s
page = pfn_to_page(pfn + i);
if (zone && page_zone(page) != zone)
return 0;
+ if (!zone)
+ start = pfn + i;
zone = page_zone(page);
+ end = pfn + MAX_ORDER_NR_PAGES;
}
}
- if (zone)
+ if (zone) {
+ *valid_start = start;
+ *valid_end = end;
return 1;
- else
+ } else {
return 0;
+ }
}
/*
@@ -1863,6 +1872,7 @@ static int __ref __offline_pages(unsigne
long offlined_pages;
int ret, drain, retry_max, node;
unsigned long flags;
+ unsigned long valid_start, valid_end;
struct zone *zone;
struct memory_notify arg;
@@ -1873,10 +1883,10 @@ static int __ref __offline_pages(unsigne
return -EINVAL;
/* This makes hotplug much easier...and readable.
we assume this for now. .*/
- if (!test_pages_in_a_zone(start_pfn, end_pfn))
+ if (!test_pages_in_a_zone(start_pfn, end_pfn, &valid_start, &valid_end))
return -EINVAL;
- zone = page_zone(pfn_to_page(start_pfn));
+ zone = page_zone(pfn_to_page(valid_start));
node = zone_to_nid(zone);
nr_pages = end_pfn - start_pfn;