public inbox for linux-kernel@vger.kernel.org
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Douglas Anderson <dianders@chromium.org>,
	Daniel Thompson <daniel.thompson@linaro.org>,
	Sasha Levin <sashal@kernel.org>,
	kgdb-bugreport@lists.sourceforge.net
Subject: [PATCH AUTOSEL 5.7 52/53] kgdb: Avoid suspicious RCU usage warning
Date: Wed,  1 Jul 2020 21:22:01 -0400
Message-ID: <20200702012202.2700645-52-sashal@kernel.org>
In-Reply-To: <20200702012202.2700645-1-sashal@kernel.org>

From: Douglas Anderson <dianders@chromium.org>

[ Upstream commit 440ab9e10e2e6e5fd677473ee6f9e3af0f6904d6 ]

At times when I'm using kgdb I see a splat on my console about
suspicious RCU usage.  I managed to come up with a case that
reproduces it, and the splat looks like this:

  WARNING: suspicious RCU usage
  5.7.0-rc4+ #609 Not tainted
  -----------------------------
  kernel/pid.c:395 find_task_by_pid_ns() needs rcu_read_lock() protection!

  other info that might help us debug this:

    rcu_scheduler_active = 2, debug_locks = 1
  3 locks held by swapper/0/1:
   #0: ffffff81b6b8e988 (&dev->mutex){....}-{3:3}, at: __device_attach+0x40/0x13c
   #1: ffffffd01109e9e8 (dbg_master_lock){....}-{2:2}, at: kgdb_cpu_enter+0x20c/0x7ac
   #2: ffffffd01109ea90 (dbg_slave_lock){....}-{2:2}, at: kgdb_cpu_enter+0x3ec/0x7ac

  stack backtrace:
  CPU: 7 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc4+ #609
  Hardware name: Google Cheza (rev3+) (DT)
  Call trace:
   dump_backtrace+0x0/0x1b8
   show_stack+0x1c/0x24
   dump_stack+0xd4/0x134
   lockdep_rcu_suspicious+0xf0/0x100
   find_task_by_pid_ns+0x5c/0x80
   getthread+0x8c/0xb0
   gdb_serial_stub+0x9d4/0xd04
   kgdb_cpu_enter+0x284/0x7ac
   kgdb_handle_exception+0x174/0x20c
   kgdb_brk_fn+0x24/0x30
   call_break_hook+0x6c/0x7c
   brk_handler+0x20/0x5c
   do_debug_exception+0x1c8/0x22c
   el1_sync_handler+0x3c/0xe4
   el1_sync+0x7c/0x100
   rpmh_rsc_probe+0x38/0x420
   platform_drv_probe+0x94/0xb4
   really_probe+0x134/0x300
   driver_probe_device+0x68/0x100
   __device_attach_driver+0x90/0xa8
   bus_for_each_drv+0x84/0xcc
   __device_attach+0xb4/0x13c
   device_initial_probe+0x18/0x20
   bus_probe_device+0x38/0x98
   device_add+0x38c/0x420

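If I understand properly, we should be able to wrap kgdb in one big
RCU read-side critical section and the problem should go away.  We'll
add it to the beast-of-a-function known as kgdb_cpu_enter(): take
rcu_read_lock() at the acquirelock label and drop it on every path
that leaves the function or jumps back to that label.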

With this I no longer get any splats and things seem to work fine.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20200602154729.v2.1.I70e0d4fd46d5ed2aaf0c98a355e8e1b7a5bb7e4e@changeid
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 kernel/debug/debug_core.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index d47c7d6656cd3..9be6accf8fe3d 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -577,6 +577,7 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
 		arch_kgdb_ops.disable_hw_break(regs);
 
 acquirelock:
+	rcu_read_lock();
 	/*
 	 * Interrupts will be restored by the 'trap return' code, except when
 	 * single stepping.
@@ -636,6 +637,7 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
 			atomic_dec(&slaves_in_kgdb);
 			dbg_touch_watchdogs();
 			local_irq_restore(flags);
+			rcu_read_unlock();
 			return 0;
 		}
 		cpu_relax();
@@ -654,6 +656,7 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
 		raw_spin_unlock(&dbg_master_lock);
 		dbg_touch_watchdogs();
 		local_irq_restore(flags);
+		rcu_read_unlock();
 
 		goto acquirelock;
 	}
@@ -777,6 +780,7 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
 	raw_spin_unlock(&dbg_master_lock);
 	dbg_touch_watchdogs();
 	local_irq_restore(flags);
+	rcu_read_unlock();
 
 	return kgdb_info[cpu].ret_state;
 }
-- 
2.25.1
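
The locking rule the patch relies on (every rcu_read_lock() taken at
acquirelock is matched by exactly one rcu_read_unlock() on each exit or
retry path) can be illustrated with a small userspace sketch. This is
not kernel code: rcu_read_lock() and rcu_read_unlock() below are
stand-ins that only count nesting depth, and cpu_enter_sketch(),
lookup_task(), and the role enum are hypothetical names invented for
the illustration. The three branches only loosely mirror the three
places where the diff adds rcu_read_unlock().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-ins for the kernel primitives: they only track nesting depth. */
static int rcu_depth;
static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }

/* Hypothetical stand-in for an RCU-protected lookup such as
 * find_task_by_pid_ns(): it reports whether the caller is inside a
 * read-side critical section. */
static bool lookup_task(void) { return rcu_depth > 0; }

/* Hypothetical roles that select which path the sketch takes. */
enum role { ROLE_SLAVE, ROLE_MASTER_RETRY_ONCE, ROLE_MASTER };

/* Simplified control flow: lock at the label, unlock on every way out. */
static int cpu_enter_sketch(enum role role, bool *lookup_ok)
{
	bool retried = false;

acquirelock:
	rcu_read_lock();

	if (role == ROLE_SLAVE) {
		/* Early return path: drop the lock before returning. */
		rcu_read_unlock();
		return 0;
	}

	if (role == ROLE_MASTER_RETRY_ONCE && !retried) {
		/* Retry path: drop the lock before jumping back to the
		 * label, otherwise the nesting depth would grow by one
		 * on every retry. */
		retried = true;
		rcu_read_unlock();
		goto acquirelock;
	}

	/* Debugger work happens here, inside the read-side section. */
	*lookup_ok = lookup_task();

	/* Final exit path. */
	rcu_read_unlock();
	return 1;
}

#ifdef KGDB_SKETCH_MAIN
int main(void)
{
	bool ok = false;
	int ret = cpu_enter_sketch(ROLE_MASTER_RETRY_ONCE, &ok);

	printf("ret=%d lookup_ok=%d depth=%d\n", ret, ok, rcu_depth);
	return 0;
}
#endif
```

Building with -DKGDB_SKETCH_MAIN gives a small demo program. The point
of the sketch is only that the depth counter returns to zero whichever
path is taken, which is the property the three added
rcu_read_unlock() calls are meant to preserve.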

