From: lizf@kernel.org
To: stable@vger.kernel.org
Cc: linux-kernel@vger.kernel.org,
Wanpeng Li <wanpeng.li@linux.intel.com>,
David Rientjes <rientjes@google.com>,
Prarit Bhargava <prarit@redhat.com>,
Steven Rostedt <srostedt@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>, Zefan Li <lizefan@huawei.com>
Subject: [PATCH 3.4 62/91] sched: Fix unreleased llc_shared_mask bit during CPU hotplug
Date: Thu, 27 Nov 2014 16:42:45 +0800
Message-ID: <1417077794-9299-62-git-send-email-lizf@kernel.org>
In-Reply-To: <1417077368-9217-1-git-send-email-lizf@kernel.org>
From: Wanpeng Li <wanpeng.li@linux.intel.com>
3.4.105-rc1 review patch. If anyone has any objections, please let me know.
------------------
commit 03bd4e1f7265548832a76e7919a81f3137c44fd1 upstream.
The following bug can be triggered by repeatedly hot-adding and
hot-removing a large number of Xen domain0 vCPUs:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 IP: [..] find_busiest_group
PGD 5a9d5067 PUD 13067 PMD 0
Oops: 0000 [#3] SMP
[...]
Call Trace:
load_balance
? _raw_spin_unlock_irqrestore
idle_balance
__schedule
schedule
schedule_timeout
? lock_timer_base
schedule_timeout_uninterruptible
msleep
lock_device_hotplug_sysfs
online_store
dev_attr_store
sysfs_write_file
vfs_write
SyS_write
system_call_fastpath
The last level cache shared mask is built during CPU up, and the
build_sched_domain() routine uses it to set up the sched domain
CPU topology.
However, llc_shared_mask is not released during CPU disable,
which leads to an invalid sched domain CPU topology.
This patch fixes it by releasing the llc_shared_mask correctly
during CPU disable.
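As a rough userspace illustration of the release step described above (a sketch only, not kernel code): the toy_* names and the 8-CPU size are hypothetical, and plain integers stand in for the kernel's cpumask helpers. Unlinking a CPU removes its bit from every sibling's mask and then empties its own mask.

```c
#include <stdint.h>

#define TOY_NR_CPUS 8	/* hypothetical toy size, not a kernel constant */

/* toy_llc_mask[i] has bit j set when CPU j shares a last level cache with CPU i. */
static uint32_t toy_llc_mask[TOY_NR_CPUS];

/* Rough analogue of the CPU-up path: record that cpu and peer share an LLC. */
static void toy_link_llc(int cpu, int peer)
{
	uint32_t both = (UINT32_C(1) << cpu) | (UINT32_C(1) << peer);

	toy_llc_mask[cpu] |= both;
	toy_llc_mask[peer] |= both;
}

/*
 * Rough analogue of the release step: drop cpu from every sibling's
 * mask, then wipe cpu's own mask so nothing stale survives a re-add.
 */
static void toy_unlink_llc(int cpu)
{
	for (int sibling = 0; sibling < TOY_NR_CPUS; sibling++)
		if (toy_llc_mask[cpu] & (UINT32_C(1) << sibling))
			toy_llc_mask[sibling] &= ~(UINT32_C(1) << cpu);
	toy_llc_mask[cpu] = 0;
}
```

In this model, linking CPU 0 and CPU 1 and then unlinking CPU 1 leaves only CPU 0's own bit in toy_llc_mask[0], and toy_llc_mask[1] is empty.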
Yasuaki also reported that this can happen on real hardware:
https://lkml.org/lkml/2014/7/22/1018
His case is here:
==
Here is an example on my system.
My system has 4 sockets, each socket has 15 cores, and HT is
enabled. In this case, the cores of each socket are numbered as
follows:
         | CPU#
Socket#0 | 0-14 , 60-74
Socket#1 | 15-29, 75-89
Socket#2 | 30-44, 90-104
Socket#3 | 45-59, 105-119
Then llc_shared_mask of CPU#30 has 0x3fff80000001fffc0000000.
It means that last level cache of Socket#2 is shared with
CPU#30-44 and 90-104.
When hot-removing socket#2 and #3, the cores of the remaining
sockets are numbered as follows:
         | CPU#
Socket#0 | 0-14 , 60-74
Socket#1 | 15-29, 75-89
But llc_shared_mask is not cleared, so llc_shared_mask of CPU#30
still has 0x3fff80000001fffc0000000.
After that, when hot-adding socket#2 and #3, the cores of each
socket are numbered as follows:
         | CPU#
Socket#0 | 0-14 , 60-74
Socket#1 | 15-29, 75-89
Socket#2 | 30-59
Socket#3 | 90-119
Then llc_shared_mask of CPU#30 becomes
0x3fff8000fffffffc0000000. It means that last level cache of
Socket#2 is shared with CPU#30-59 and 90-104. So the mask has
the wrong value.
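The sequence in this report can be replayed with a small userspace model. This is only a sketch: the replay_* names are hypothetical, a bool array stands in for a cpumask, and only the CPU ranges quoted above are used. Without clearing on removal, the mask of CPU#30 ends up as the union of the old and new sibling sets.

```c
#include <stdbool.h>
#include <string.h>

#define REPLAY_NR_CPUS 120	/* 4 sockets x 15 cores x 2 threads, as in the report */

struct replay_mask {
	bool bit[REPLAY_NR_CPUS];
};

/* Set every CPU in [first, last], like the CPU-up path adding LLC siblings. */
static void replay_set_range(struct replay_mask *m, int first, int last)
{
	for (int cpu = first; cpu <= last; cpu++)
		m->bit[cpu] = true;
}

/* Count the CPUs present in the mask. */
static int replay_weight(const struct replay_mask *m)
{
	int n = 0;

	for (int cpu = 0; cpu < REPLAY_NR_CPUS; cpu++)
		n += m->bit[cpu];
	return n;
}

/*
 * Replay the hot-remove / hot-add sequence for CPU#30's mask.
 * clear_on_remove == false models the behaviour in the report,
 * clear_on_remove == true models a mask that is released on removal.
 */
static struct replay_mask replay_cpu30(bool clear_on_remove)
{
	struct replay_mask m;

	memset(&m, 0, sizeof(m));
	replay_set_range(&m, 30, 44);	/* initial Socket#2 numbering */
	replay_set_range(&m, 90, 104);
	if (clear_on_remove)		/* hot-remove of socket#2 and #3 */
		memset(&m, 0, sizeof(m));
	replay_set_range(&m, 30, 59);	/* Socket#2 numbering after hot-add */
	return m;
}
```

With clear_on_remove set to false the model keeps CPUs 90-104 in the mask in addition to 30-59, which matches the wrong result described above. With it set to true only 30-59 remain.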
Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Tested-by: Linn Crosetto <linn@hp.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1411547885-48165-1-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Zefan Li <lizefan@huawei.com>
---
arch/x86/kernel/smpboot.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index d28c595..c7dbf02 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1248,6 +1248,9 @@ static void remove_siblinginfo(int cpu)
 
 	for_each_cpu(sibling, cpu_sibling_mask(cpu))
 		cpumask_clear_cpu(cpu, cpu_sibling_mask(sibling));
+	for_each_cpu(sibling, cpu_llc_shared_mask(cpu))
+		cpumask_clear_cpu(cpu, cpu_llc_shared_mask(sibling));
+	cpumask_clear(cpu_llc_shared_mask(cpu));
 	cpumask_clear(cpu_sibling_mask(cpu));
 	cpumask_clear(cpu_core_mask(cpu));
 	c->phys_proc_id = 0;
--
1.9.1