From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Eric Dumazet <edumazet@google.com>,
	Alex Thorlton <athorlton@sgi.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 3.14 66/70] dump_stack: avoid potential deadlocks
Date: Tue, 23 Feb 2016 19:34:15 -0800
Message-ID: <20160224033356.197414597@linuxfoundation.org>
In-Reply-To: <20160224033354.061464831@linuxfoundation.org>

3.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

commit d7ce36924344ace0dbdc855b1206cacc46b36d45 upstream.

Some servers experienced fatal deadlocks because of a combination of
bugs, leading to multiple cpus calling dump_stack().

The checksumming bug was fixed in commit 34ae6a1aa054 ("ipv6: update
skb->csum when CE mark is propagated").

The second problem is faulty locking in dump_stack():

CPU1 runs in process context and calls dump_stack(), grabs dump_lock.

   CPU2 receives a TCP packet under softirq, grabs the socket spinlock, and
   calls dump_stack() from netdev_rx_csum_fault().

   dump_stack() spins on atomic_cmpxchg(&dump_lock, -1, 2), since
   dump_lock is owned by CPU1.

While dumping its stack, CPU1 is interrupted by a softirq, and happens
to process a packet for the TCP socket locked by CPU2.

CPU1 spins forever in spin_lock(): deadlock.

The stack trace on CPU1 looked like this:

    NMI backtrace for cpu 1
    RIP: _raw_spin_lock+0x25/0x30
    ...
    Call Trace:
      <IRQ>
      tcp_v6_rcv+0x243/0x620
      ip6_input_finish+0x11f/0x330
      ip6_input+0x38/0x40
      ip6_rcv_finish+0x3c/0x90
      ipv6_rcv+0x2a9/0x500
      process_backlog+0x461/0xaa0
      net_rx_action+0x147/0x430
      __do_softirq+0x167/0x2d0
      call_softirq+0x1c/0x30
      do_softirq+0x3f/0x80
      irq_exit+0x6e/0xc0
      smp_call_function_single_interrupt+0x35/0x40
      call_function_single_interrupt+0x6a/0x70
      <EOI>
      printk+0x4d/0x4f
      printk_address+0x31/0x33
      print_trace_address+0x33/0x3c
      print_context_stack+0x7f/0x119
      dump_trace+0x26b/0x28e
      show_trace_log_lvl+0x4f/0x5c
      show_stack_log_lvl+0x104/0x113
      show_stack+0x42/0x44
      dump_stack+0x46/0x58
      netdev_rx_csum_fault+0x38/0x3c
      __skb_checksum_complete_head+0x6e/0x80
      __skb_checksum_complete+0x11/0x20
      tcp_rcv_established+0x2bd5/0x2fd0
      tcp_v6_do_rcv+0x13c/0x620
      sk_backlog_rcv+0x15/0x30
      release_sock+0xd2/0x150
      tcp_recvmsg+0x1c1/0xfc0
      inet_recvmsg+0x7d/0x90
      sock_recvmsg+0xaf/0xe0
      ___sys_recvmsg+0x111/0x3b0
      SyS_recvmsg+0x5c/0xb0
      system_call_fastpath+0x16/0x1b

Fixes: b58d977432c8 ("dump_stack: serialize the output from dump_stack()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alex Thorlton <athorlton@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 lib/dump_stack.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

--- a/lib/dump_stack.c
+++ b/lib/dump_stack.c
@@ -25,6 +25,7 @@ static atomic_t dump_lock = ATOMIC_INIT(
 
 asmlinkage void dump_stack(void)
 {
+	unsigned long flags;
 	int was_locked;
 	int old;
 	int cpu;
@@ -33,9 +34,8 @@ asmlinkage void dump_stack(void)
 	 * Permit this cpu to perform nested stack dumps while serialising
 	 * against other CPUs
 	 */
-	preempt_disable();
-
 retry:
+	local_irq_save(flags);
 	cpu = smp_processor_id();
 	old = atomic_cmpxchg(&dump_lock, -1, cpu);
 	if (old == -1) {
@@ -43,6 +43,7 @@ retry:
 	} else if (old == cpu) {
 		was_locked = 1;
 	} else {
+		local_irq_restore(flags);
 		cpu_relax();
 		goto retry;
 	}
@@ -52,7 +53,7 @@ retry:
 	if (!was_locked)
 		atomic_set(&dump_lock, -1);
 
-	preempt_enable();
+	local_irq_restore(flags);
 }
 #else
 asmlinkage void dump_stack(void)
