From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org,
Jack Morgenstein <jackm@dev.mellanox.co.il>,
Leon Romanovsky <leon@kernel.org>,
Doug Ledford <dledford@redhat.com>
Subject: [PATCH 4.10 47/93] IB/mlx4: Reduce SRIOV multicast cleanup warning message to debug level
Date: Thu, 18 May 2017 12:47:16 +0200
Message-ID: <20170518104745.049293845@linuxfoundation.org>
In-Reply-To: <20170518104743.163522815@linuxfoundation.org>

4.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jack Morgenstein <jackm@dev.mellanox.co.il>

commit fb7a91746af18b2ebf596778b38a709cdbc488d3 upstream.

A warning message during SRIOV multicast cleanup should have actually been
a debug level message. The condition generating the warning does no harm
and can fill the message log.

In some cases, during testing, some tests were so intense as to swamp the
message log with these warning messages, causing a stall in the console
message log output task. This stall caused an NMI to be sent to all CPUs
(so that they all dumped their stacks into the message log).

Aside from the message flood causing an NMI, the tests all passed.

Once the message flood which caused the NMI is removed (by reducing the
warning message to debug level), the NMI no longer occurs.

Sample message log (console log) output illustrating the flood and
resultant NMI (snippets with comments and modified with ... instead
of hex digits, to satisfy checkpatch.pl):
<mlx4_ib> _mlx4_ib_mcg_port_cleanup: ... WARNING: group refcount 1!!!...
*** About 4000 almost identical lines in less than one second ***
<mlx4_ib> _mlx4_ib_mcg_port_cleanup: ... WARNING: group refcount 1!!!...
INFO: rcu_sched detected stalls on CPUs/tasks: { 17} (...)
*** { 17} above indicates that CPU 17 was the one that stalled ***
sending NMI to all CPUs:
...
NMI backtrace for cpu 17
CPU: 17 PID: 45909 Comm: kworker/17:2
Hardware name: HP ProLiant DL360p Gen8, BIOS P71 09/08/2013
Workqueue: events fb_flashcursor
task: ffff880478...... ti: ffff88064e...... task.ti: ffff88064e......
RIP: 0010:[ffffffff81......] [ffffffff81......] io_serial_in+0x15/0x20
RSP: 0018:ffff88064e257cb0 EFLAGS: 00000002
RAX: 0000000000...... RBX: ffffffff81...... RCX: 0000000000......
RDX: 0000000000...... RSI: 0000000000...... RDI: ffffffff81......
RBP: ffff88064e...... R08: ffffffff81...... R09: 0000000000......
R10: 0000000000...... R11: ffff88064e...... R12: 0000000000......
R13: 0000000000...... R14: ffffffff81...... R15: 0000000000......
FS: 0000000000......(0000) GS:ffff8804af......(0000) knlGS:000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080......
CR2: 00007f2a2f...... CR3: 0000000001...... CR4: 0000000000......
DR0: 0000000000...... DR1: 0000000000...... DR2: 0000000000......
DR3: 0000000000...... DR6: 00000000ff...... DR7: 0000000000......
Stack:
ffff88064e...... ffffffff81...... ffffffff81...... 0000000000......
ffffffff81...... ffff88064e...... ffffffff81...... ffffffff81......
ffffffff81...... ffff88064e...... ffffffff81...... 0000000000......
Call Trace:
[<ffffffff813d099b>] wait_for_xmitr+0x3b/0xa0
[<ffffffff813d0b5c>] serial8250_console_putchar+0x1c/0x30
[<ffffffff813d0b40>] ? serial8250_console_write+0x140/0x140
[<ffffffff813cb5fa>] uart_console_write+0x3a/0x80
[<ffffffff813d0aae>] serial8250_console_write+0xae/0x140
[<ffffffff8107c4d1>] call_console_drivers.constprop.15+0x91/0xf0
[<ffffffff8107d6cf>] console_unlock+0x3bf/0x400
[<ffffffff813503cd>] fb_flashcursor+0x5d/0x140
[<ffffffff81355c30>] ? bit_clear+0x120/0x120
[<ffffffff8109d5fb>] process_one_work+0x17b/0x470
[<ffffffff8109e3cb>] worker_thread+0x11b/0x400
[<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400
[<ffffffff810a5aef>] kthread+0xcf/0xe0
[<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
[<ffffffff81645858>] ret_from_fork+0x58/0x90
[<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140
Code: 48 89 e5 d3 e6 48 63 f6 48 03 77 10 8b 06 5d c3 66 0f 1f 44 00 00 66 66 66 6

As indicated in the stack trace above, the console output task got swamped.

Fixes: b9c5d6a64358 ("IB/mlx4: Add multicast group (MCG) paravirtualization for SR-IOV")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/infiniband/hw/mlx4/mcg.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/infiniband/hw/mlx4/mcg.c
+++ b/drivers/infiniband/hw/mlx4/mcg.c
@@ -1102,7 +1102,8 @@ static void _mlx4_ib_mcg_port_cleanup(st
 	while ((p = rb_first(&ctx->mcg_table)) != NULL) {
 		group = rb_entry(p, struct mcast_group, node);
 		if (atomic_read(&group->refcount))
-			mcg_warn_group(group, "group refcount %d!!! (pointer %p)\n", atomic_read(&group->refcount), group);
+			mcg_debug_group(group, "group refcount %d!!! (pointer %p)\n",
+					atomic_read(&group->refcount), group);
 
 		force_clean_group(group);
 	}
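
For readers unfamiliar with level-gated logging, here is a minimal userspace
sketch of the idea behind the change above. It is not the driver's code: the
names sketch_level, sketch_threshold, sketch_log and sketch_cleanup are
hypothetical, and the real mcg_warn_group()/mcg_debug_group() macros are
defined in the kernel source, not here. The only facts borrowed from the
kernel are the numeric ordering of the warning (4) and debug (7) log levels.
The sketch assumes a simple threshold: a message is emitted only if its level
is at or below the threshold.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical levels; lower value means more severe, as in the kernel. */
enum sketch_level { SKETCH_WARN = 4, SKETCH_DEBUG = 7 };

/* Hypothetical threshold: only messages with level <= threshold are emitted. */
static int sketch_threshold = SKETCH_WARN;

/* Returns 1 if the message was written to stderr, 0 if it was filtered out. */
static int sketch_log(enum sketch_level level, const char *fmt, ...)
{
	va_list ap;

	if ((int)level > sketch_threshold)
		return 0;

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	return 1;
}

/*
 * Simulate cleaning up n groups that still hold a reference, reporting
 * each one at the given level. Returns the number of lines emitted.
 */
static int sketch_cleanup(int n, enum sketch_level level)
{
	int emitted = 0;
	int i;

	for (i = 0; i < n; i++)
		emitted += sketch_log(level, "group refcount %d!!!\n", 1);

	return emitted;
}
```

With the threshold left at SKETCH_WARN, sketch_cleanup(4000, SKETCH_WARN)
writes 4000 lines, while sketch_cleanup(4000, SKETCH_DEBUG) writes none. The
cleanup work itself is unchanged; only the reporting is silenced unless
someone raises the threshold to look for it.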