From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: John Sperbeck <jsperbeck@google.com>,
Dennis Zhou <dennis@kernel.org>, Sasha Levin <sashal@kernel.org>,
linux-mm@kvack.org, netdev@vger.kernel.org, bpf@vger.kernel.org
Subject: [PATCH AUTOSEL 5.1 057/186] percpu: remove spurious lock dependency between percpu and sched
Date: Sat, 1 Jun 2019 09:14:33 -0400 [thread overview]
Message-ID: <20190601131653.24205-57-sashal@kernel.org> (raw)
In-Reply-To: <20190601131653.24205-1-sashal@kernel.org>
From: John Sperbeck <jsperbeck@google.com>
[ Upstream commit 198790d9a3aeaef5792d33a560020861126edc22 ]
In free_percpu() we sometimes call pcpu_schedule_balance_work() to
queue a work item (which does a wakeup) while holding pcpu_lock.
This creates an unnecessary lock dependency between pcpu_lock and
the scheduler's pi_lock. There are other places where we call
pcpu_schedule_balance_work() without hold pcpu_lock, and this case
doesn't need to be different.
Moving the call outside the lock prevents the following lockdep splat
when running tools/testing/selftests/bpf/{test_maps,test_progs} in
sequence with lockdep enabled:
======================================================
WARNING: possible circular locking dependency detected
5.1.0-dbg-DEV #1 Not tainted
------------------------------------------------------
kworker/23:255/18872 is trying to acquire lock:
000000000bc79290 (&(&pool->lock)->rlock){-.-.}, at: __queue_work+0xb2/0x520
but task is already holding lock:
00000000e3e7a6aa (pcpu_lock){..-.}, at: free_percpu+0x36/0x260
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #4 (pcpu_lock){..-.}:
lock_acquire+0x9e/0x180
_raw_spin_lock_irqsave+0x3a/0x50
pcpu_alloc+0xfa/0x780
__alloc_percpu_gfp+0x12/0x20
alloc_htab_elem+0x184/0x2b0
__htab_percpu_map_update_elem+0x252/0x290
bpf_percpu_hash_update+0x7c/0x130
__do_sys_bpf+0x1912/0x1be0
__x64_sys_bpf+0x1a/0x20
do_syscall_64+0x59/0x400
entry_SYSCALL_64_after_hwframe+0x49/0xbe
-> #3 (&htab->buckets[i].lock){....}:
lock_acquire+0x9e/0x180
_raw_spin_lock_irqsave+0x3a/0x50
htab_map_update_elem+0x1af/0x3a0
-> #2 (&rq->lock){-.-.}:
lock_acquire+0x9e/0x180
_raw_spin_lock+0x2f/0x40
task_fork_fair+0x37/0x160
sched_fork+0x211/0x310
copy_process.part.43+0x7b1/0x2160
_do_fork+0xda/0x6b0
kernel_thread+0x29/0x30
rest_init+0x22/0x260
arch_call_rest_init+0xe/0x10
start_kernel+0x4fd/0x520
x86_64_start_reservations+0x24/0x26
x86_64_start_kernel+0x6f/0x72
secondary_startup_64+0xa4/0xb0
-> #1 (&p->pi_lock){-.-.}:
lock_acquire+0x9e/0x180
_raw_spin_lock_irqsave+0x3a/0x50
try_to_wake_up+0x41/0x600
wake_up_process+0x15/0x20
create_worker+0x16b/0x1e0
workqueue_init+0x279/0x2ee
kernel_init_freeable+0xf7/0x288
kernel_init+0xf/0x180
ret_from_fork+0x24/0x30
-> #0 (&(&pool->lock)->rlock){-.-.}:
__lock_acquire+0x101f/0x12a0
lock_acquire+0x9e/0x180
_raw_spin_lock+0x2f/0x40
__queue_work+0xb2/0x520
queue_work_on+0x38/0x80
free_percpu+0x221/0x260
pcpu_freelist_destroy+0x11/0x20
stack_map_free+0x2a/0x40
bpf_map_free_deferred+0x3c/0x50
process_one_work+0x1f7/0x580
worker_thread+0x54/0x410
kthread+0x10f/0x150
ret_from_fork+0x24/0x30
other info that might help us debug this:
Chain exists of:
&(&pool->lock)->rlock --> &htab->buckets[i].lock --> pcpu_lock
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(pcpu_lock);
lock(&htab->buckets[i].lock);
lock(pcpu_lock);
lock(&(&pool->lock)->rlock);
*** DEADLOCK ***
3 locks held by kworker/23:255/18872:
#0: 00000000b36a6e16 ((wq_completion)events){+.+.},
at: process_one_work+0x17a/0x580
#1: 00000000dfd966f0 ((work_completion)(&map->work)){+.+.},
at: process_one_work+0x17a/0x580
#2: 00000000e3e7a6aa (pcpu_lock){..-.},
at: free_percpu+0x36/0x260
stack backtrace:
CPU: 23 PID: 18872 Comm: kworker/23:255 Not tainted 5.1.0-dbg-DEV #1
Hardware name: ...
Workqueue: events bpf_map_free_deferred
Call Trace:
dump_stack+0x67/0x95
print_circular_bug.isra.38+0x1c6/0x220
check_prev_add.constprop.50+0x9f6/0xd20
__lock_acquire+0x101f/0x12a0
lock_acquire+0x9e/0x180
_raw_spin_lock+0x2f/0x40
__queue_work+0xb2/0x520
queue_work_on+0x38/0x80
free_percpu+0x221/0x260
pcpu_freelist_destroy+0x11/0x20
stack_map_free+0x2a/0x40
bpf_map_free_deferred+0x3c/0x50
process_one_work+0x1f7/0x580
worker_thread+0x54/0x410
kthread+0x10f/0x150
ret_from_fork+0x24/0x30
Signed-off-by: John Sperbeck <jsperbeck@google.com>
Signed-off-by: Dennis Zhou <dennis@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
mm/percpu.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/mm/percpu.c b/mm/percpu.c
index 68dd2e7e73b5f..d832793bf83a1 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -1738,6 +1738,7 @@ void free_percpu(void __percpu *ptr)
struct pcpu_chunk *chunk;
unsigned long flags;
int off;
+ bool need_balance = false;
if (!ptr)
return;
@@ -1759,7 +1760,7 @@ void free_percpu(void __percpu *ptr)
list_for_each_entry(pos, &pcpu_slot[pcpu_nr_slots - 1], list)
if (pos != chunk) {
- pcpu_schedule_balance_work();
+ need_balance = true;
break;
}
}
@@ -1767,6 +1768,9 @@ void free_percpu(void __percpu *ptr)
trace_percpu_free_percpu(chunk->base_addr, off, ptr);
spin_unlock_irqrestore(&pcpu_lock, flags);
+
+ if (need_balance)
+ pcpu_schedule_balance_work();
}
EXPORT_SYMBOL_GPL(free_percpu);
--
2.20.1
next prev parent reply other threads:[~2019-06-01 13:18 UTC|newest]
Thread overview: 97+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-06-01 13:13 [PATCH AUTOSEL 5.1 001/186] media: rockchip/vpu: Fix/re-order probe-error/remove path Sasha Levin
2019-06-01 13:13 ` Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 002/186] media: rockchip/vpu: Add missing dont_use_autosuspend() calls Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 003/186] rapidio: fix a NULL pointer dereference when create_workqueue() fails Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 004/186] fs/fat/file.c: issue flush after the writeback of FAT Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 005/186] sysctl: return -EINVAL if val violates minmax Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 006/186] ipc: prevent lockup on alloc_msg and free_msg Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 007/186] drm/msm: correct attempted NULL pointer dereference in debugfs Sasha Levin
2019-06-01 13:13 ` Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 008/186] drm/pl111: Initialize clock spinlock early Sasha Levin
2019-06-01 13:13 ` Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 009/186] mm/mprotect.c: fix compilation warning because of unused 'mm' variable Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 010/186] ARM: prevent tracing IPI_CPU_BACKTRACE Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 011/186] mm/hmm: select mmu notifier when selecting HMM Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 012/186] hugetlbfs: on restore reserve error path retain subpool reservation Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 013/186] mm/memory_hotplug: release memory resource after arch_remove_memory() Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 014/186] mem-hotplug: fix node spanned pages when we have a node with only ZONE_MOVABLE Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 015/186] mm/cma.c: fix crash on CMA allocation if bitmap allocation fails Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 016/186] initramfs: free initrd memory if opening /initrd.image fails Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 017/186] mm/compaction.c: fix an undefined behaviour Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 018/186] mm/memory_hotplug.c: fix the wrong usage of N_HIGH_MEMORY Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 019/186] mm/cma.c: fix the bitmap status to show failed allocation reason Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 020/186] mm: page_mkclean vs MADV_DONTNEED race Sasha Levin
2019-06-01 13:13 ` Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 021/186] mm/cma_debug.c: fix the break condition in cma_maxchunk_get() Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 022/186] mm/slab.c: fix an infinite loop in leaks_show() Sasha Levin
2019-06-01 13:13 ` [PATCH AUTOSEL 5.1 023/186] kernel/sys.c: prctl: fix false positive in validate_prctl_map() Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 024/186] thermal: rcar_gen3_thermal: disable interrupt in .remove Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 025/186] drivers: thermal: tsens: Don't print error message on -EPROBE_DEFER Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 026/186] mfd: tps65912-spi: Add missing of table registration Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 027/186] mfd: intel-lpss: Set the device in reset state when init Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 028/186] drm/nouveau/disp/dp: respect sink limits when selecting failsafe link configuration Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 029/186] mfd: twl6040: Fix device init errors for ACCCTL register Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 030/186] perf/x86/intel: Allow PEBS multi-entry in watermark mode Sasha Levin
[not found] ` <20190601131653.24205-1-sashal-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 031/186] drm/nouveau/kms/gf119-gp10x: push HeadSetControlOutputResource() mthd when encoders change Sasha Levin
2019-06-01 13:14 ` Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 035/186] drm/nouveau/kms/gv100-: fix spurious window immediate interlocks Sasha Levin
2019-06-01 13:14 ` Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 032/186] drm/nouveau: fix duplication of nv50_head_atom struct Sasha Levin
2019-06-01 13:14 ` Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 033/186] drm/bridge: adv7511: Fix low refresh rate selection Sasha Levin
2019-06-01 13:14 ` Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 034/186] objtool: Don't use ignore flag for fake jumps Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 036/186] bpf: fix undefined behavior in narrow load handling Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 037/186] gcc-plugins: arm_ssp_per_task_plugin: Fix for older GCC < 6 Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 038/186] EDAC/mpc85xx: Prevent building as a module Sasha Levin
2019-06-01 13:14 ` Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 039/186] NFS4: Fix v4.0 client state corruption when mount Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 040/186] pwm: meson: Use the spin-lock only to protect register modifications Sasha Levin
2019-06-01 13:14 ` Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 041/186] mailbox: stm32-ipcc: check invalid irq Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 042/186] ntp: Allow TAI-UTC offset to be set to zero Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 043/186] f2fs: fix to avoid panic in do_recover_data() Sasha Levin
2019-06-01 13:14 ` Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 044/186] f2fs: fix to avoid panic in f2fs_inplace_write_data() Sasha Levin
2019-06-01 13:14 ` Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 045/186] f2fs: fix error path of recovery Sasha Levin
2019-06-01 13:14 ` Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 046/186] f2fs: fix to avoid panic in f2fs_remove_inode_page() Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 047/186] f2fs: fix to do sanity check on free nid Sasha Levin
2019-06-01 13:14 ` Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 048/186] f2fs: fix to clear dirty inode in error path of f2fs_iget() Sasha Levin
2019-06-01 13:14 ` Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 049/186] f2fs: fix to avoid panic in dec_valid_block_count() Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 050/186] f2fs: fix to use inline space only if inline_xattr is enable Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 051/186] f2fs: fix to avoid panic in dec_valid_node_count() Sasha Levin
2019-06-01 13:14 ` Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 052/186] f2fs: fix to do sanity check on valid block count of segment Sasha Levin
2019-06-01 13:14 ` Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 053/186] f2fs: fix to avoid deadloop in foreground GC Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 054/186] f2fs: fix to retrieve inline xattr space Sasha Levin
2019-06-01 13:14 ` Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 055/186] f2fs: fix to do checksum even if inode page is uptodate Sasha Levin
2019-06-01 13:14 ` Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 056/186] media: atmel: atmel-isc: fix asd memory allocation Sasha Levin
2019-06-01 13:14 ` Sasha Levin [this message]
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 058/186] tracing: probeevent: Fix to make the type of $comm string Sasha Levin
2019-06-08 21:31 ` Steven Rostedt
2019-06-09 19:13 ` Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 059/186] tracing: Fix partial reading of trace event's id file Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 060/186] configfs: fix possible use-after-free in configfs_register_group Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 061/186] uml: fix a boot splat wrt use of cpu_all_mask Sasha Levin
2019-06-01 13:14 ` Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 062/186] cifs: fix credits leak for SMB1 oplock breaks Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 063/186] PCI: dwc: Free MSI in dw_pcie_host_init() error path Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 064/186] PCI: dwc: Free MSI IRQ page in dw_pcie_free_msi() Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 065/186] fbcon: Don't reset logo_shown when logo is currently shown Sasha Levin
2019-06-01 13:14 ` Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 066/186] netfilter: ctnetlink: Resolve conntrack L3-protocol flush regression Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 067/186] ovl: do not generate duplicate fsnotify events for "fake" path Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 068/186] mmc: mmci: Prevent polling for busy detection in IRQ context Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 069/186] netfilter: nf_flow_table: fix missing error check for rhashtable_insert_fast Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 070/186] netfilter: nf_conntrack_h323: restore boundary check correctness Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 071/186] mips: Make sure dt memory regions are valid Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 072/186] netfilter: nf_tables: fix base chain stat rcu_dereference usage Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 073/186] watchdog: Use depends instead of select for pretimeout governors Sasha Levin
2019-06-01 13:14 ` [PATCH AUTOSEL 5.1 074/186] watchdog: imx2_wdt: Fix set_timeout for big timeout values Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190601131653.24205-57-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=dennis@kernel.org \
--cc=jsperbeck@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=netdev@vger.kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.