From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org,
Sukadev Bhattiprolu <sukadev@linux.ibm.com>,
Dany Madden <drt@linux.ibm.com>, Jakub Kicinski <kuba@kernel.org>
Subject: [PATCH 5.4 19/85] powerpc/vnic: Extend "failover pending" window
Date: Mon, 9 Nov 2020 13:55:16 +0100 [thread overview]
Message-ID: <20201109125023.510258188@linuxfoundation.org> (raw)
In-Reply-To: <20201109125022.614792961@linuxfoundation.org>
From: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
[ Upstream commit 1d8504937478fdc2f3ef2174a816fd3302eca882 ]
Commit 5a18e1e0c193b introduced the 'failover_pending' state to track
the "failover pending window" - where we wait for the partner to become
ready (after a transport event) before actually attempting to failover.
i.e window is between following two events:
a. we get a transport event due to a FAILOVER
b. later, we get CRQ_INITIALIZED indicating the partner is
ready at which point we schedule a FAILOVER reset.
and ->failover_pending is true during this window.
If during this window, we attempt to open (or close) a device, we pretend
that the operation succeded and let the FAILOVER reset path complete the
operation.
This is fine, except if the transport event ("a" above) occurs during the
open and after open has already checked whether a failover is pending. If
that happens, we fail the open, which can cause the boot scripts to leave
the interface down requiring administrator to manually bring up the device.
This fix "extends" the failover pending window till we are _actually_
ready to perform the failover reset (i.e until after we get the RTNL
lock). Since open() holds the RTNL lock, we can be sure that we either
finish the open or if the open() fails due to the failover pending window,
we can again pretend that open is done and let the failover complete it.
We could try and block the open until failover is completed but a) that
could still timeout the application and b) Existing code "pretends" that
failover occurred "just after" open succeeded, so marks the open successful
and lets the failover complete the open. So, mark the open successful even
if the transport event occurs before we actually start the open.
Fixes: 5a18e1e0c193 ("ibmvnic: Fix failover case for non-redundant configuration")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Acked-by: Dany Madden <drt@linux.ibm.com>
Link: https://lore.kernel.org/r/20201030170711.1562994-1-sukadev@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/net/ethernet/ibm/ibmvnic.c | 36 ++++++++++++++++++++++++++++++++----
1 file changed, 32 insertions(+), 4 deletions(-)
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1109,18 +1109,27 @@ static int ibmvnic_open(struct net_devic
if (adapter->state != VNIC_CLOSED) {
rc = ibmvnic_login(netdev);
if (rc)
- return rc;
+ goto out;
rc = init_resources(adapter);
if (rc) {
netdev_err(netdev, "failed to initialize resources\n");
release_resources(adapter);
- return rc;
+ goto out;
}
}
rc = __ibmvnic_open(netdev);
+out:
+ /*
+ * If open fails due to a pending failover, set device state and
+ * return. Device operation will be handled by reset routine.
+ */
+ if (rc && adapter->failover_pending) {
+ adapter->state = VNIC_OPEN;
+ rc = 0;
+ }
return rc;
}
@@ -1842,6 +1851,13 @@ static int do_reset(struct ibmvnic_adapt
rwi->reset_reason);
rtnl_lock();
+ /*
+ * Now that we have the rtnl lock, clear any pending failover.
+ * This will ensure ibmvnic_open() has either completed or will
+ * block until failover is complete.
+ */
+ if (rwi->reset_reason == VNIC_RESET_FAILOVER)
+ adapter->failover_pending = false;
netif_carrier_off(netdev);
adapter->reset_reason = rwi->reset_reason;
@@ -2112,6 +2128,13 @@ static void __ibmvnic_reset(struct work_
/* CHANGE_PARAM requestor holds rtnl_lock */
rc = do_change_param_reset(adapter, rwi, reset_state);
} else if (adapter->force_reset_recovery) {
+ /*
+ * Since we are doing a hard reset now, clear the
+ * failover_pending flag so we don't ignore any
+ * future MOBILITY or other resets.
+ */
+ adapter->failover_pending = false;
+
/* Transport event occurred during previous reset */
if (adapter->wait_for_reset) {
/* Previous was CHANGE_PARAM; caller locked */
@@ -2176,9 +2199,15 @@ static int ibmvnic_reset(struct ibmvnic_
unsigned long flags;
int ret;
+ /*
+ * If failover is pending don't schedule any other reset.
+ * Instead let the failover complete. If there is already a
+ * a failover reset scheduled, we will detect and drop the
+ * duplicate reset when walking the ->rwi_list below.
+ */
if (adapter->state == VNIC_REMOVING ||
adapter->state == VNIC_REMOVED ||
- adapter->failover_pending) {
+ (adapter->failover_pending && reason != VNIC_RESET_FAILOVER)) {
ret = EBUSY;
netdev_dbg(netdev, "Adapter removing or pending failover, skipping reset\n");
goto err;
@@ -4532,7 +4561,6 @@ static void ibmvnic_handle_crq(union ibm
case IBMVNIC_CRQ_INIT:
dev_info(dev, "Partner initialized\n");
adapter->from_passive_init = true;
- adapter->failover_pending = false;
if (!completion_done(&adapter->init_done)) {
complete(&adapter->init_done);
adapter->init_done_rc = -EIO;
next prev parent reply other threads:[~2020-11-09 13:12 UTC|newest]
Thread overview: 89+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-09 12:54 [PATCH 5.4 00/85] 5.4.76-rc1 review Greg Kroah-Hartman
2020-11-09 12:54 ` [PATCH 5.4 01/85] drm/i915: Break up error capture compression loops with cond_resched() Greg Kroah-Hartman
2020-11-09 12:54 ` [PATCH 5.4 02/85] drm/i915/gt: Delay execlist processing for tgl Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 03/85] drm/i915: Drop runtime-pm assert from vgpu io accessors Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 04/85] ASoC: Intel: Skylake: Add alternative topology binary name Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 05/85] linkage: Introduce new macros for assembler symbols Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 06/85] arm64: asm: Add new-style position independent function annotations Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 07/85] arm64: lib: Use modern annotations for assembly functions Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 08/85] arm64: Change .weak to SYM_FUNC_START_WEAK_PI for arch/arm64/lib/mem*.S Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 09/85] tipc: fix use-after-free in tipc_bcast_get_mode Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 10/85] ptrace: fix task_join_group_stop() for the case when current is traced Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 11/85] cadence: force nonlinear buffers to be cloned Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 12/85] chelsio/chtls: fix memory leaks caused by a race Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 13/85] chelsio/chtls: fix always leaking ctrl_skb Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 14/85] gianfar: Replace skb_realloc_headroom with skb_cow_head for PTP Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 15/85] gianfar: Account for Tx PTP timestamp in the skb headroom Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 16/85] ionic: check port ptr before use Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 17/85] ip_tunnel: fix over-mtu packet send fail without TUNNEL_DONT_FRAGMENT flags Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 18/85] net: usb: qmi_wwan: add Telit LE910Cx 0x1230 composition Greg Kroah-Hartman
2020-11-09 12:55 ` Greg Kroah-Hartman [this message]
2020-11-09 12:55 ` [PATCH 5.4 20/85] sctp: Fix COMM_LOST/CANT_STR_ASSOC err reporting on big-endian platforms Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 21/85] sfp: Fix error handing in sfp_probe() Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 22/85] Fonts: Replace discarded const qualifier Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 23/85] ALSA: hda/realtek - Fixed HP headset Mic cant be detected Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 24/85] ALSA: hda/realtek - Enable headphone for ASUS TM420 Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 25/85] ALSA: usb-audio: Add implicit feedback quirk for Zoom UAC-2 Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 26/85] ALSA: usb-audio: add usb vendor id as DSD-capable for Khadas devices Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 27/85] ALSA: usb-audio: Add implicit feedback quirk for Qu-16 Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 28/85] ALSA: usb-audio: Add implicit feedback quirk for MODX Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 29/85] mm: mempolicy: fix potential pte_unmap_unlock pte error Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 30/85] lib/crc32test: remove extra local_irq_disable/enable Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 31/85] kthread_worker: prevent queuing delayed work from timer_fn when it is being canceled Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 32/85] mm: always have io_remap_pfn_range() set pgprot_decrypted() Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 33/85] gfs2: Wake up when sd_glock_disposal becomes zero Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 34/85] ring-buffer: Fix recursion protection transitions between interrupt context Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 35/85] mtd: spi-nor: Dont copy self-pointing struct around Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 36/85] ftrace: Fix recursion check for NMI test Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 37/85] ftrace: Handle tracing when switching between context Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 38/85] regulator: defer probe when trying to get voltage from unresolved supply Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 39/85] spi: bcm2835: fix gpio cs level inversion Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 40/85] tracing: Fix out of bounds write in get_trace_buf Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 41/85] futex: Handle transient "ownerless" rtmutex state correctly Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 42/85] ARM: dts: sun4i-a10: fix cpu_alert temperature Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 43/85] arm64: dts: meson: add missing g12 rng clock Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 44/85] x86/kexec: Use up-to-dated screen_info copy to fill boot params Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 45/85] of: Fix reserved-memory overlap detection Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 46/85] drm/sun4i: frontend: Rework a bit the phase data Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 47/85] drm/sun4i: frontend: Reuse the ch0 phase for RGB formats Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 48/85] drm/sun4i: frontend: Fix the scaler phase on A33 Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 49/85] blk-cgroup: Fix memleak on error path Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 50/85] blk-cgroup: Pre-allocate tree node on blkg_conf_prep Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 51/85] scsi: core: Dont start concurrent async scan on same host Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 52/85] drm/amdgpu: add DID for navi10 blockchain SKU Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 53/85] scsi: ibmvscsi: Fix potential race after loss of transport Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 54/85] vsock: use ns_capable_noaudit() on socket create Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 55/85] nvme-rdma: handle unexpected nvme completion data length Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 56/85] nvmet: fix a NULL pointer dereference when tracing the flush command Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 57/85] drm/vc4: drv: Add error handding for bind Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 58/85] ACPI: NFIT: Fix comparison to -ENXIO Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 59/85] usb: cdns3: gadget: suspicious implicit sign extension Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 60/85] drm/nouveau/nouveau: fix the start/end range for migration Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 61/85] drm/nouveau/gem: fix "refcount_t: underflow; use-after-free" Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 5.4 62/85] arm64/smp: Move rcu_cpu_starting() earlier Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 63/85] Revert "coresight: Make sysfs functional on topologies with per core sink" Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 64/85] vt: Disable KD_FONT_OP_COPY Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 65/85] fork: fix copy_process(CLONE_PARENT) race with the exiting ->real_parent Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 66/85] s390/pkey: fix paes selftest failure with paes and pkey static build Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 67/85] serial: 8250_mtk: Fix uart_get_baud_rate warning Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 68/85] serial: txx9: add missing platform_driver_unregister() on error in serial_txx9_init Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 69/85] USB: serial: cyberjack: fix write-URB completion race Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 70/85] USB: serial: option: add Quectel EC200T module support Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 71/85] USB: serial: option: add LE910Cx compositions 0x1203, 0x1230, 0x1231 Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 72/85] USB: serial: option: add Telit FN980 composition 0x1055 Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 73/85] tty: serial: fsl_lpuart: add LS1028A support Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 74/85] tty: serial: fsl_lpuart: LS1021A has a FIFO size of 16 words, like LS1028A Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 75/85] usb: dwc3: ep0: Fix delay status handling Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 76/85] USB: Add NO_LPM quirk for Kingston flash drive Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 77/85] usb: mtu3: fix panic in mtu3_gadget_stop() Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 78/85] drm/panfrost: Fix a deadlock between the shrinker and madvise path Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 79/85] ARC: stack unwinding: avoid indefinite looping Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 80/85] PM: runtime: Drop runtime PM references to supplier on link removal Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 81/85] PM: runtime: Drop pm_runtime_clean_up_links() Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 82/85] PM: runtime: Resume the device earlier in __device_release_driver() Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 83/85] xfs: flush for older, xfs specific ioctls Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 84/85] perf/core: Fix a memory leak in perf_event_parse_addr_filter() Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 5.4 85/85] arm64: dts: marvell: espressobin: Add ethernet switch aliases Greg Kroah-Hartman
2020-11-09 23:05 ` [PATCH 5.4 00/85] 5.4.76-rc1 review Guenter Roeck
2020-11-09 23:22 ` Shuah Khan
2020-11-10 4:14 ` Naresh Kamboju
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20201109125023.510258188@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=drt@linux.ibm.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=stable@vger.kernel.org \
--cc=sukadev@linux.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox