From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Gu Zheng <guz.fnst@cn.fujitsu.com>,
Xishi Qiu <qiuxishi@huawei.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
David Rientjes <rientjes@google.com>,
Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>,
Taku Izumi <izumi.taku@jp.fujitsu.com>,
Tang Chen <tangchen@cn.fujitsu.com>,
Xie XiuQi <xiexiuqi@huawei.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 3.19 044/101] mm/memory hotplug: postpone the reset of obsolete pgdat
Date: Fri, 17 Apr 2015 15:28:32 +0200 [thread overview]
Message-ID: <20150417132516.314598681@linuxfoundation.org> (raw)
In-Reply-To: <20150417132514.379828774@linuxfoundation.org>
3.19-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gu Zheng <guz.fnst@cn.fujitsu.com>
commit b0dc3a342af36f95a68fe229b8f0f73552c5ca08 upstream.
Qiu Xishi reported the following BUG when testing hot-add/hot-remove node under
stress condition:
BUG: unable to handle kernel paging request at 0000000000025f60
IP: next_online_pgdat+0x1/0x50
PGD 0
Oops: 0000 [#1] SMP
ACPI: Device does not support D3cold
Modules linked in: fuse nls_iso8859_1 nls_cp437 vfat fat loop dm_mod coretemp mperf crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 pcspkr microcode igb dca i2c_algo_bit ipv6 megaraid_sas iTCO_wdt i2c_i801 i2c_core iTCO_vendor_support tg3 sg hwmon ptp lpc_ich pps_core mfd_core acpi_pad rtc_cmos button ext3 jbd mbcache sd_mod crc_t10dif scsi_dh_alua scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh ahci libahci libata scsi_mod [last unloaded: rasf]
CPU: 23 PID: 238 Comm: kworker/23:1 Tainted: G O 3.10.15-5885-euler0302 #1
Hardware name: HUAWEI TECHNOLOGIES CO.,LTD. Huawei N1/Huawei N1, BIOS V100R001 03/02/2015
Workqueue: events vmstat_update
task: ffffa800d32c0000 ti: ffffa800d32ae000 task.ti: ffffa800d32ae000
RIP: 0010: next_online_pgdat+0x1/0x50
RSP: 0018:ffffa800d32afce8 EFLAGS: 00010286
RAX: 0000000000001440 RBX: ffffffff81da53b8 RCX: 0000000000000082
RDX: 0000000000000000 RSI: 0000000000000082 RDI: 0000000000000000
RBP: ffffa800d32afd28 R08: ffffffff81c93bfc R09: ffffffff81cbdc96
R10: 00000000000040ec R11: 00000000000000a0 R12: ffffa800fffb3440
R13: ffffa800d32afd38 R14: 0000000000000017 R15: ffffa800e6616800
FS: 0000000000000000(0000) GS:ffffa800e6600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000025f60 CR3: 0000000001a0b000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
refresh_cpu_vm_stats+0xd0/0x140
vmstat_update+0x11/0x50
process_one_work+0x194/0x3d0
worker_thread+0x12b/0x410
kthread+0xc6/0xd0
ret_from_fork+0x7c/0xb0
The cause is the "memset(pgdat, 0, sizeof(*pgdat))" at the end of
try_offline_node, which will reset all the content of pgdat to 0, as the
pgdat is accessed lock-free, so that the users still using the pgdat
will panic, such as the vmstat_update routine.
process A: offline node XX:
vmstat_updat()
refresh_cpu_vm_stats()
for_each_populated_zone()
find online node XX
cond_resched()
offline cpu and memory, then try_offline_node()
node_set_offline(nid), and memset(pgdat, 0, sizeof(*pgdat))
zone = next_zone(zone)
pg_data_t *pgdat = zone->zone_pgdat; // here pgdat is NULL now
next_online_pgdat(pgdat)
next_online_node(pgdat->node_id); // NULL pointer access
So the solution here is postponing the reset of obsolete pgdat from
try_offline_node() to hotadd_new_pgdat(), and just resetting
pgdat->nr_zones and pgdat->classzone_idx to be 0 rather than the memset
0 to avoid breaking pointer information in pgdat.
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reported-by: Xishi Qiu <qiuxishi@huawei.com>
Suggested-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/memory_hotplug.c | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1092,6 +1092,10 @@ static pg_data_t __ref *hotadd_new_pgdat
return NULL;
arch_refresh_nodedata(nid, pgdat);
+ } else {
+ /* Reset the nr_zones and classzone_idx to 0 before reuse */
+ pgdat->nr_zones = 0;
+ pgdat->classzone_idx = 0;
}
/* we can use NODE_DATA(nid) from here */
@@ -1977,15 +1981,6 @@ void try_offline_node(int nid)
if (is_vmalloc_addr(zone->wait_table))
vfree(zone->wait_table);
}
-
- /*
- * Since there is no way to guarentee the address of pgdat/zone is not
- * on stack of any kernel threads or used by other kernel objects
- * without reference counting or other symchronizing method, do not
- * reset node_data and free pgdat here. Just reset it to 0 and reuse
- * the memory when the node is online again.
- */
- memset(pgdat, 0, sizeof(*pgdat));
}
EXPORT_SYMBOL(try_offline_node);
next prev parent reply other threads:[~2015-04-17 14:02 UTC|newest]
Thread overview: 99+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-04-17 13:27 [PATCH 3.19 000/101] 3.19.5-stable review Greg Kroah-Hartman
2015-04-17 13:27 ` [PATCH 3.19 001/101] ALSA: hda - Add dock support for Thinkpad T450s (17aa:5036) Greg Kroah-Hartman
2015-04-17 13:27 ` [PATCH 3.19 002/101] ALSA: hda - Add one more node in the EAPD supporting candidate list Greg Kroah-Hartman
2015-04-17 13:27 ` [PATCH 3.19 003/101] ALSA: usb - Creative USB X-Fi Pro SB1095 volume knob support Greg Kroah-Hartman
2015-04-17 13:27 ` [PATCH 3.19 004/101] ALSA: bebob: fix to processing in big-endian machine for sending cue Greg Kroah-Hartman
2015-04-17 13:27 ` [PATCH 3.19 005/101] ALSA: hda/realtek - Make more stable to get pin sense for ALC283 Greg Kroah-Hartman
2015-04-17 13:27 ` [PATCH 3.19 006/101] ALSA: hda - Fix headphone pin config for Lifebook T731 Greg Kroah-Hartman
2015-04-17 13:27 ` [PATCH 3.19 007/101] PCI/AER: Avoid info leak in __print_tlp_header() Greg Kroah-Hartman
2015-04-17 13:27 ` [PATCH 3.19 008/101] PCI: cpcihp: Add missing curly braces in cpci_configure_slot() Greg Kroah-Hartman
2015-04-17 13:27 ` [PATCH 3.19 009/101] Revert "sparc/PCI: Clip bridge windows to fit in upstream windows" Greg Kroah-Hartman
2015-04-17 13:27 ` [PATCH 3.19 010/101] PCI: Dont look for ACPI hotplug parameters if ACPI is disabled Greg Kroah-Hartman
2015-04-17 13:27 ` [PATCH 3.19 011/101] PCI: spear: Drop __initdata from spear13xx_pcie_driver Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 012/101] ARC: SA_SIGINFO ucontext regs off-by-one Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 013/101] ARC: signal handling robustify Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 014/101] ARM: sunxi: Have ARCH_SUNXI select RESET_CONTROLLER for clock driver usage Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 015/101] selinux: fix sel_write_enforce broken return value Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 016/101] blk-mq: fix use of incorrect goto label in blk_mq_init_queue error path Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 017/101] blkmq: Fix NULL pointer deref when all reserved tags in Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 018/101] Fix bug in blk_rq_merge_ok Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 019/101] block: fix blk_stack_limits() regression due to lcm() change Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 020/101] drm: Fixup racy refcounting in plane_force_disable Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 021/101] drm/edid: set ELD for firmware and debugfs override EDIDs Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 023/101] drm/radeon/dpm: fix 120hz handling harder Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 027/101] drm/i915/vlv: save/restore the power context base reg Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 028/101] drm/i915/vlv: remove wait for previous GFX clk disable request Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 030/101] drm/i915: Align initial plane backing objects correctly Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 031/101] btrfs: simplify insert_orphan_item Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 032/101] IB/uverbs: Prevent integer overflow in ib_umem_get address arithmetic Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 033/101] iwlwifi: dvm: run INIT firmware again upon .start() Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 034/101] x86/xen: prepare p2m list for memory hotplug Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 035/101] xen/balloon: before adding hotplugged memory, set frames to invalid Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 036/101] nfsd: return correct openowner when there is a race to put one in the hash Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 037/101] nfsd: return correct lockowner when there is a race on hash insert Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 038/101] sunrpc: make debugfs file creation failure non-fatal Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 039/101] powerpc: fix memory corruption by pnv_alloc_idle_core_states Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 040/101] powerpc: Re-enable dynticks Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 041/101] nbd: fix possible memory leak Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 042/101] mac80211: fix RX A-MPDU session reorder timer deletion Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 043/101] mm: fix anon_vma->degree underflow in anon_vma endless growing prevention Greg Kroah-Hartman
2015-04-17 13:28 ` Greg Kroah-Hartman [this message]
2015-04-17 13:28 ` [PATCH 3.19 045/101] mm/page_alloc.c: call kernel_map_pages in unset_migrateype_isolate Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 046/101] sched: Fix RLIMIT_RTTIME when PI-boosting to RT Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 047/101] cpufreq: Schedule work for the first-online CPU on resume Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 048/101] writeback: add missing INITIAL_JIFFIES init in global_update_bandwidth() Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 049/101] writeback: fix possible underflow in write bandwidth calculation Greg Kroah-Hartman
2015-04-17 13:28 ` Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 050/101] libata: Update Crucial/Micron blacklist Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 051/101] libata: Blacklist queued TRIM on Samsung SSD 850 Pro Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 053/101] USB: keyspan_pda: add new device id Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 054/101] USB: ftdi_sio: Added custom PID for Synapse Wireless product Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 055/101] USB: ftdi_sio: Use jtag quirk for SNAP Connect E10 Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 056/101] Defer processing of REQ_PREEMPT requests for blocked devices Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 057/101] iio: inv_mpu6050: Clear timestamps fifo while resetting hardware fifo Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 058/101] iio: core: Fix double free Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 059/101] iio: bmc150: change sampling frequency Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 060/101] iio: adc: vf610: use ADC clock within specification Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 061/101] iio: imu: Use iio_trigger_get for indio_dev->trig assignment Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 062/101] dmaengine: edma: fix memory leak when terminating running transfers Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 063/101] dmaengine: omap-dma: Fix memory leak when terminating running transfer Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 064/101] ath9k: fix tracking of enabled AP beacons Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 065/101] x86/reboot: Add ASRock Q1900DC-ITX mainboard reboot quirk Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 066/101] can: flexcan: fix bus-off error state handling Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 067/101] can: flexcan: Deferred on Regulator return EPROBE_DEFER Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 068/101] firmware: dmi_scan: Prevent dmi_num integer overflow Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 069/101] cpuidle: remove state_count field from struct cpuidle_device Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 070/101] cpuidle: ACPI: do not overwrite name and description of C0 Greg Kroah-Hartman
2015-04-17 13:28 ` [PATCH 3.19 071/101] usb: xhci: handle Config Error Change (CEC) in xhci driver Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 072/101] usb: xhci: apply XHCI_AVOID_BEI quirk to all Intel xHCI controllers Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 073/101] tty: serial: fsl_lpuart: specify transmit FIFO size Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 074/101] tty: serial: fsl_lpuart: clear receive flag on FIFO flush Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 075/101] n_tty: Fix read buffer overwrite when no newline Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 076/101] cifs: smb2_clone_range() - exit on unhandled error Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 077/101] cifs: fix use-after-free bug in find_writable_file Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 078/101] brcmfmac: disable MBSS feature for BCM43362 Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 079/101] iommu/vt-d: Detach domain *only* from attached iommus Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 080/101] rtlwifi: Fix IOMMU mapping leak in AP mode Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 081/101] drivers/of: Add empty ranges quirk for PA-Semi Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 082/101] Revert "PM / hibernate: avoid unsafe pages in e820 reserved regions" Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 083/101] Revert "libceph: use memalloc flags for net IO" Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 084/101] be2iscsi: Fix kernel panic when device initialization fails Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 085/101] ocfs2: _really_ sync the right range Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 086/101] ioctx_alloc(): fix vma (and file) leak on failure Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 087/101] iscsi target: fix oops when adding reject pdu Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 088/101] [media] sh_veu: v4l2_dev wasnt set Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 089/101] [media] media: s5p-mfc: fix mmap support for 64bit arch Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 090/101] [media] cx23885: fix querycap Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 091/101] [media] soc-camera: Fix devm_kfree() in soc_of_bind() Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 092/101] [media] vb2: Fix dma_dir setting for dma-contig mem type Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 093/101] [media] vb2: fix UNBALANCED warnings when calling vb2_thread_stop() Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 096/101] IB/mlx4: Saturate RoCE port PMA counters in case of overflow Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 097/101] timers/tick/broadcast-hrtimer: Fix suspicious RCU usage in idle loop Greg Kroah-Hartman
2015-04-17 13:29 ` Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 098/101] ext4: fix indirect punch hole corruption Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 099/101] xfs: ensure truncate forces zeroed blocks to disk Greg Kroah-Hartman
2015-04-17 13:29 ` [PATCH 3.19 101/101] kvm: avoid page allocation failure in kvm_set_memory_region() Greg Kroah-Hartman
2015-04-17 17:34 ` [PATCH 3.19 000/101] 3.19.5-stable review Shuah Khan
2015-04-17 19:43 ` Greg Kroah-Hartman
2015-04-17 20:03 ` Guenter Roeck
2015-04-18 18:59 ` Greg Kroah-Hartman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150417132516.314598681@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=akpm@linux-foundation.org \
--cc=guz.fnst@cn.fujitsu.com \
--cc=isimatu.yasuaki@jp.fujitsu.com \
--cc=izumi.taku@jp.fujitsu.com \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=qiuxishi@huawei.com \
--cc=rientjes@google.com \
--cc=stable@vger.kernel.org \
--cc=tangchen@cn.fujitsu.com \
--cc=torvalds@linux-foundation.org \
--cc=xiexiuqi@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.