From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Toshi Kani <toshi.kani@hpe.com>,
Zhang Zhen <zhenzhang.zhang@huawei.com>,
Reza Arbab <arbab@linux.vnet.ibm.com>,
David Rientjes <rientjes@google.com>,
Dan Williams <dan.j.williams@intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 4.4 29/29] base/memory, hotplug: fix a kernel oops in show_valid_zones()
Date: Tue, 7 Feb 2017 13:45:58 +0100
Message-ID: <20170207124459.980853759@linuxfoundation.org>
In-Reply-To: <20170207124458.685292007@linuxfoundation.org>
4.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Toshi Kani <toshi.kani@hpe.com>
commit a96dfddbcc04336bbed50dc2b24823e45e09e80c upstream.
Reading a sysfs "memoryN/valid_zones" file leads to the following oops
when the first page of a range is not backed by struct page.
show_valid_zones() assumes that 'start_pfn' is always valid for
page_zone().
BUG: unable to handle kernel paging request at ffffea017a000000
IP: show_valid_zones+0x6f/0x160
This issue may happen on x86-64 systems with 64GiB or more memory since
their memory block size is bumped up to 2GiB. [1] An example of such a
system is described below. 0x3240000000 is only aligned to 1GiB, and
its memory block starts at 0x3200000000, which is not backed by
struct page.
BIOS-e820: [mem 0x0000003240000000-0x000000603fffffff] usable
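The alignment arithmetic in this example can be checked with a minimal
user-space sketch in C. It assumes 4KiB pages; the EXAMPLE_* macros and
example_*() functions are hypothetical names used for illustration only,
not kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for illustration: 4KiB pages, and the 2GiB memory block
 * size that the commit message describes for large x86-64 systems. */
#define EXAMPLE_PAGE_SHIFT	12
#define EXAMPLE_BLOCK_SIZE	(2ULL << 30)

/* Round addr down to the start of the memory block that contains it. */
static uint64_t example_block_start(uint64_t addr)
{
	return addr & ~(EXAMPLE_BLOCK_SIZE - 1);
}

/* Count the page frames between the block start and the first usable address. */
static uint64_t example_hole_pages(uint64_t first_usable)
{
	return (first_usable - example_block_start(first_usable))
		>> EXAMPLE_PAGE_SHIFT;
}

/* Check the numbers quoted in the commit message. */
static void example_self_check(void)
{
	/* The usable range starts 1GiB into its 2GiB memory block. */
	assert(example_block_start(0x3240000000ULL) == 0x3200000000ULL);
	assert(example_hole_pages(0x3240000000ULL) ==
	       ((1ULL << 30) >> EXAMPLE_PAGE_SHIFT));
}
```

Under these assumptions the first 1GiB of the block is a hole, so the
block's first pfn is the one that the old show_valid_zones() code passed
to page_zone().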
Since test_pages_in_a_zone() already checks holes, fix this issue by
extending this function to return 'valid_start' and 'valid_end' for a
given range. show_valid_zones() then proceeds with the valid range.
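The idea of returning the valid sub-range can be sketched in user space
as follows. This is a simplified model, not the kernel code: struct
example_pfn_info and example_test_range() are hypothetical, a negative
zone_id stands in for "no struct page", and only the first pfn of each
chunk is inspected.

```c
#include <assert.h>

/* Hypothetical stand-in for a page frame; zone_id < 0 means a hole. */
struct example_pfn_info {
	int zone_id;
};

/*
 * Walk [start_pfn, end_pfn) in steps of 'chunk' pfns. Skip chunks that
 * are holes, fail if two populated chunks belong to different zones, and
 * on success report the populated sub-range through valid_start/valid_end.
 */
static int example_test_range(const struct example_pfn_info *map,
			      unsigned long start_pfn, unsigned long end_pfn,
			      unsigned long chunk,
			      unsigned long *valid_start,
			      unsigned long *valid_end)
{
	unsigned long pfn, first = 0, last = 0;
	int zone = -1;

	for (pfn = start_pfn; pfn < end_pfn; pfn += chunk) {
		if (map[pfn].zone_id < 0)
			continue;	/* hole: nothing backs this chunk */
		if (zone >= 0 && map[pfn].zone_id != zone)
			return 0;	/* range spans more than one zone */
		if (zone < 0)
			first = pfn;	/* first populated chunk */
		zone = map[pfn].zone_id;
		last = pfn + chunk;	/* end of the last populated chunk */
	}
	if (zone < 0)
		return 0;		/* the whole range is a hole */

	*valid_start = first;
	*valid_end = last;
	return 1;
}
```

A caller in this model looks up the zone from the returned valid_start
rather than from start_pfn, which is the change the patch makes in
show_valid_zones() and __offline_pages().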
[1] 'Commit bdee237c0343 ("x86: mm: Use 2GB memory block size on
large-memory x86-64 systems")'
Link: http://lkml.kernel.org/r/20170127222149.30893-3-toshi.kani@hpe.com
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Zhang Zhen <zhenzhang.zhang@huawei.com>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org> [4.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/base/memory.c | 11 +++++------
include/linux/memory_hotplug.h | 3 ++-
mm/memory_hotplug.c | 20 +++++++++++++++-----
3 files changed, 22 insertions(+), 12 deletions(-)
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -388,30 +388,29 @@ static ssize_t show_valid_zones(struct d
{
struct memory_block *mem = to_memory_block(dev);
unsigned long start_pfn, end_pfn;
+ unsigned long valid_start, valid_end;
unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
- struct page *first_page;
struct zone *zone;
start_pfn = section_nr_to_pfn(mem->start_section_nr);
end_pfn = start_pfn + nr_pages;
- first_page = pfn_to_page(start_pfn);
/* The block contains more than one zone can not be offlined. */
- if (!test_pages_in_a_zone(start_pfn, end_pfn))
+ if (!test_pages_in_a_zone(start_pfn, end_pfn, &valid_start, &valid_end))
return sprintf(buf, "none\n");
- zone = page_zone(first_page);
+ zone = page_zone(pfn_to_page(valid_start));
if (zone_idx(zone) == ZONE_MOVABLE - 1) {
/*The mem block is the last memoryblock of this zone.*/
- if (end_pfn == zone_end_pfn(zone))
+ if (valid_end == zone_end_pfn(zone))
return sprintf(buf, "%s %s\n",
zone->name, (zone + 1)->name);
}
if (zone_idx(zone) == ZONE_MOVABLE) {
/*The mem block is the first memoryblock of ZONE_MOVABLE.*/
- if (start_pfn == zone->zone_start_pfn)
+ if (valid_start == zone->zone_start_pfn)
return sprintf(buf, "%s %s\n",
zone->name, (zone - 1)->name);
}
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -85,7 +85,8 @@ extern int zone_grow_waitqueues(struct z
extern int add_one_highpage(struct page *page, int pfn, int bad_ppro);
/* VM interface that may be used by firmware interface */
extern int online_pages(unsigned long, unsigned long, int);
-extern int test_pages_in_a_zone(unsigned long, unsigned long);
+extern int test_pages_in_a_zone(unsigned long start_pfn, unsigned long end_pfn,
+ unsigned long *valid_start, unsigned long *valid_end);
extern void __offline_isolated_pages(unsigned long, unsigned long);
typedef void (*online_page_callback_t)(struct page *page);
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1372,10 +1372,13 @@ int is_mem_section_removable(unsigned lo
/*
* Confirm all pages in a range [start, end) belong to the same zone.
+ * When true, return its valid [start, end).
*/
-int test_pages_in_a_zone(unsigned long start_pfn, unsigned long end_pfn)
+int test_pages_in_a_zone(unsigned long start_pfn, unsigned long end_pfn,
+ unsigned long *valid_start, unsigned long *valid_end)
{
unsigned long pfn, sec_end_pfn;
+ unsigned long start, end;
struct zone *zone = NULL;
struct page *page;
int i;
@@ -1397,14 +1400,20 @@ int test_pages_in_a_zone(unsigned long s
page = pfn_to_page(pfn + i);
if (zone && page_zone(page) != zone)
return 0;
+ if (!zone)
+ start = pfn + i;
zone = page_zone(page);
+ end = pfn + MAX_ORDER_NR_PAGES;
}
}
- if (zone)
+ if (zone) {
+ *valid_start = start;
+ *valid_end = end;
return 1;
- else
+ } else {
return 0;
+ }
}
/*
@@ -1722,6 +1731,7 @@ static int __ref __offline_pages(unsigne
long offlined_pages;
int ret, drain, retry_max, node;
unsigned long flags;
+ unsigned long valid_start, valid_end;
struct zone *zone;
struct memory_notify arg;
@@ -1732,10 +1742,10 @@ static int __ref __offline_pages(unsigne
return -EINVAL;
/* This makes hotplug much easier...and readable.
we assume this for now. .*/
- if (!test_pages_in_a_zone(start_pfn, end_pfn))
+ if (!test_pages_in_a_zone(start_pfn, end_pfn, &valid_start, &valid_end))
return -EINVAL;
- zone = page_zone(pfn_to_page(start_pfn));
+ zone = page_zone(pfn_to_page(valid_start));
node = zone_to_nid(zone);
nr_pages = end_pfn - start_pfn;