stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Ido Schimmel <idosch@nvidia.com>,
	Petr Machata <petrm@nvidia.com>,
	"David S. Miller" <davem@davemloft.net>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.10 16/68] mlxsw: pci: Fix possible crash during initialization
Date: Mon, 24 Apr 2023 15:17:47 +0200	[thread overview]
Message-ID: <20230424131128.288202037@linuxfoundation.org> (raw)
In-Reply-To: <20230424131127.653885914@linuxfoundation.org>

From: Ido Schimmel <idosch@nvidia.com>

[ Upstream commit 1f64757ee2bb22a93ec89b4c71707297e8cca0ba ]

During initialization the driver issues a reset command via its command
interface in order to remove previous configuration from the device.

After issuing the reset, the driver waits for 200ms before polling on
the "system_status" register using memory-mapped IO until the device
reaches a ready state (0x5E). The wait is necessary because the reset
command only triggers the reset, but the reset itself happens
asynchronously. If the driver starts polling too soon, the read of the
"system_status" register will never return and the system will crash
[1].

The issue was discovered when the device was flashed with a development
firmware version where the reset routine took longer to complete. The
issue was fixed in the firmware, but it exposed the fact that the
current wait time is borderline.

Fix by increasing the wait time from 200ms to 400ms. With this patch and
the buggy firmware version, the issue did not reproduce in 10 reboots
whereas without the patch the issue is reproduced quite consistently.

[1]
mce: CPUs not responding to MCE broadcast (may include false positives): 0,4
mce: CPUs not responding to MCE broadcast (may include false positives): 0,4
Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
Shutting down cpus with NMI
Kernel Offset: 0x12000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

Fixes: ac004e84164e ("mlxsw: pci: Wait longer before accessing the device after reset")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/net/ethernet/mellanox/mlxsw/pci_hw.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h b/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h
index a2c1fbd3e0d13..0225c8f1e5ea2 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h
@@ -26,7 +26,7 @@
 #define MLXSW_PCI_CIR_TIMEOUT_MSECS		1000
 
 #define MLXSW_PCI_SW_RESET_TIMEOUT_MSECS	900000
-#define MLXSW_PCI_SW_RESET_WAIT_MSECS		200
+#define MLXSW_PCI_SW_RESET_WAIT_MSECS		400
 #define MLXSW_PCI_FW_READY			0xA1844
 #define MLXSW_PCI_FW_READY_MASK			0xFFFF
 #define MLXSW_PCI_FW_READY_MAGIC		0x5E
-- 
2.39.2




  parent reply	other threads:[~2023-04-24 13:34 UTC|newest]

Thread overview: 74+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-24 13:17 [PATCH 5.10 00/68] 5.10.179-rc1 review Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 01/68] ARM: dts: rockchip: fix a typo error for rk3288 spdif node Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 02/68] arm64: dts: qcom: ipq8074-hk01: enable QMP device, not the PHY node Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 03/68] arm64: dts: meson-g12-common: specify full DMC range Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 04/68] arm64: dts: imx8mm-evk: correct pmic clock source Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 05/68] netfilter: br_netfilter: fix recent physdev match breakage Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 06/68] regulator: fan53555: Explicitly include bits header Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 07/68] net: sched: sch_qfq: prevent slab-out-of-bounds in qfq_activate_agg Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 08/68] virtio_net: bugfix overflow inside xdp_linearize_page() Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 09/68] sfc: Split STATE_READY in to STATE_NET_DOWN and STATE_NET_UP Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 10/68] sfc: Fix use-after-free due to selftest_work Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 11/68] netfilter: nf_tables: fix ifdef to also consider nf_tables=m Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 12/68] i40e: fix accessing vsi->active_filters without holding lock Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 13/68] i40e: fix i40e_setup_misc_vector() error handling Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 14/68] mlxfw: fix null-ptr-deref in mlxfw_mfa2_tlv_next() Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 15/68] net: rpl: fix rpl header size calculation Greg Kroah-Hartman
2023-04-24 13:17 ` Greg Kroah-Hartman [this message]
2023-04-24 13:17 ` [PATCH 5.10 17/68] bpf: Fix incorrect verifier pruning due to missing register precision taints Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 18/68] e1000e: Disable TSO on i219-LM card to increase speed Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 19/68] f2fs: Fix f2fs_truncate_partial_nodes ftrace event Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 20/68] Input: i8042 - add quirk for Fujitsu Lifebook A574/H Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 21/68] selftests: sigaltstack: fix -Wuninitialized Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 22/68] scsi: megaraid_sas: Fix fw_crash_buffer_show() Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 23/68] scsi: core: Improve scsi_vpd_inquiry() checks Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 24/68] net: dsa: b53: mmap: add phy ops Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 25/68] s390/ptrace: fix PTRACE_GET_LAST_BREAK error handling Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 26/68] nvme-tcp: fix a possible UAF when failing to allocate an io queue Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 27/68] xen/netback: use same error messages for same errors Greg Kroah-Hartman
2023-04-24 13:17 ` [PATCH 5.10 28/68] powerpc/doc: Fix htmldocs errors Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 29/68] xfs: drop submit side trans alloc for append ioends Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 30/68] iio: light: tsl2772: fix reading proximity-diodes from device tree Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 31/68] nilfs2: initialize unused bytes in segment summary blocks Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 32/68] memstick: fix memory leak if card device is never registered Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 33/68] kernel/sys.c: fix and improve control flow in __sys_setres[ug]id() Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 34/68] mmc: sdhci_am654: Set HIGH_SPEED_ENA for SDR12 and SDR25 Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 35/68] mm/khugepaged: check again on anon uffd-wp during isolation Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 36/68] sched/uclamp: Make task_fits_capacity() use util_fits_cpu() Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 37/68] sched/uclamp: Fix fits_capacity() check in feec() Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 38/68] sched/uclamp: Make select_idle_capacity() use util_fits_cpu() Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 39/68] sched/uclamp: Make asym_fits_capacity() " Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 40/68] sched/uclamp: Make cpu_overutilized() " Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 41/68] sched/uclamp: Cater for uclamp in find_energy_efficient_cpu()s early exit condition Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 42/68] sched/fair: Detect capacity inversion Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 43/68] sched/fair: Consider capacity inversion in util_fits_cpu() Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 44/68] sched/uclamp: Fix a uninitialized variable warnings Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 45/68] sched/fair: Fixes for capacity inversion detection Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 46/68] MIPS: Define RUNTIME_DISCARD_EXIT in LD script Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 47/68] [PATCH v2 stable-5.10.y stable-5.15.y] docs: futex: Fix kernel-doc references after code split-up preparation Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 48/68] purgatory: fix disabling debug info Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 49/68] virtiofs: clean up error handling in virtio_fs_get_tree() Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 50/68] virtiofs: split requests that exceed virtqueue size Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 51/68] fuse: check s_root when destroying sb Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 52/68] fuse: fix attr version comparison in fuse_read_update_size() Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 53/68] fuse: always revalidate rename target dentry Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 54/68] fuse: fix deadlock between atomic O_TRUNC and page invalidation Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 55/68] Revert "ext4: fix use-after-free in ext4_xattr_set_entry" Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 56/68] ext4: remove duplicate definition of ext4_xattr_ibody_inline_set() Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 57/68] ext4: fix use-after-free in ext4_xattr_set_entry Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 58/68] udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM) Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 59/68] tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct() Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 60/68] inet6: Remove inet6_destroy_sock() in sk->sk_prot->destroy() Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 61/68] dccp: Call inet6_destroy_sock() via sk->sk_destruct() Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 62/68] sctp: " Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 63/68] pwm: meson: Explicitly set .polarity in .get_state() Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 64/68] pwm: iqs620a: " Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 65/68] pwm: hibvt: " Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 66/68] iio: adc: at91-sama5d2_adc: fix an error code in at91_adc_allocate_trigger() Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 67/68] ASoC: fsl_asrc_dma: fix potential null-ptr-deref Greg Kroah-Hartman
2023-04-24 13:18 ` [PATCH 5.10 68/68] ASN.1: Fix check for strdup() success Greg Kroah-Hartman
2023-04-25  1:04 ` [PATCH 5.10 00/68] 5.10.179-rc1 review Guenter Roeck
2023-04-25 13:28 ` Naresh Kamboju
2023-04-25 13:35 ` Chris Paterson
2023-04-25 18:43 ` Florian Fainelli
2023-04-26  0:28 ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230424131128.288202037@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=davem@davemloft.net \
    --cc=idosch@nvidia.com \
    --cc=patches@lists.linux.dev \
    --cc=petrm@nvidia.com \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).