* [net PATCH 0/6] fbnic: FW IPC Mailbox fixes
@ 2025-05-01 23:29 Alexander Duyck
2025-05-01 23:29 ` [net PATCH 1/6] fbnic: Fix initialization of mailbox descriptor rings Alexander Duyck
` (5 more replies)
0 siblings, 6 replies; 16+ messages in thread
From: Alexander Duyck @ 2025-05-01 23:29 UTC (permalink / raw)
To: netdev; +Cc: davem, kuba, pabeni
This series is meant to address a number of issues that have been found in
the FW IPC mailbox over the past several months.
The main issues addressed are:
1. Resolve a potential race between host and FW during initialization that
can cause the FW to only have the lower 32b of an address.
2. Block the FW from issuing DMA requests after we have closed the mailbox
and before we have started issuing requests on it.
3. Fix races in the IRQ handlers that can cause the IRQ to unmask itself if
it is being processed while we are trying to disable it.
4. Cleanup the Tx flush logic so that we actually lock down the Tx path
before we start flushing it instead of letting it free run while we are
shutting it down.
5. Fix several memory leaks that could occur if we failed initialization.
6. Cleanup the mailbox completion if we are flushing Tx since we are no
longer processing Rx.
7. Move several allocations out of a potential IRQ/atomic context.
There have been a few optimizations we also picked up since then. Rather
than split them out I just folded them into these diffs. They mostly
address minor issues such as how long it takes to initialize and/or fail so
I thought they could probably go in with the rest of the patches. They
consist of:
1. Do not sleep more than 20ms waiting on FW to respond as the 200ms value
likely originated from simulation/emulation testing.
2. Use jiffies to determine timeout instead of sleep * attempts for better
accuracy.
---
Alexander Duyck (6):
fbnic: Fix initialization of mailbox descriptor rings
fbnic: Gate AXI read/write enabling on FW mailbox
fbnic: Add additional handling of IRQs
fbnic: Actually flush_tx instead of stalling out
fbnic: Cleanup handling of completions
fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context
drivers/net/ethernet/meta/fbnic/fbnic.h | 8 +-
drivers/net/ethernet/meta/fbnic/fbnic_csr.h | 2 +
drivers/net/ethernet/meta/fbnic/fbnic_fw.c | 217 +++++++++++-------
drivers/net/ethernet/meta/fbnic/fbnic_fw.h | 2 +-
drivers/net/ethernet/meta/fbnic/fbnic_irq.c | 142 ++++++++----
drivers/net/ethernet/meta/fbnic/fbnic_mac.c | 8 +-
.../net/ethernet/meta/fbnic/fbnic_netdev.c | 5 +-
drivers/net/ethernet/meta/fbnic/fbnic_pci.c | 14 +-
8 files changed, 254 insertions(+), 144 deletions(-)
--
^ permalink raw reply [flat|nested] 16+ messages in thread
* [net PATCH 1/6] fbnic: Fix initialization of mailbox descriptor rings
2025-05-01 23:29 [net PATCH 0/6] fbnic: FW IPC Mailbox fixes Alexander Duyck
@ 2025-05-01 23:29 ` Alexander Duyck
2025-05-02 10:49 ` Simon Horman
2025-05-01 23:30 ` [net PATCH 2/6] fbnic: Gate AXI read/write enabling on FW mailbox Alexander Duyck
` (4 subsequent siblings)
5 siblings, 1 reply; 16+ messages in thread
From: Alexander Duyck @ 2025-05-01 23:29 UTC (permalink / raw)
To: netdev; +Cc: davem, kuba, pabeni
From: Alexander Duyck <alexanderduyck@fb.com>
Address to issues with the FW mailbox descriptor initialization.
We need to reverse the order of accesses when we invalidate an entry versus
writing an entry. When writing an entry we write upper and then lower as
the lower 32b contain the valid bit that makes the entire address valid.
However for invalidation we should write it in the reverse order so that
the upper is marked invalid before we update it.
Without this change we may see FW attempt to access pages with the upper
32b of the address set to 0 which will likely result in DMAR faults due to
write access failures on mailbox shutdown.
Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism")
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
---
drivers/net/ethernet/meta/fbnic/fbnic_fw.c | 32 ++++++++++++++++++++--------
1 file changed, 23 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
index 88db3dacb940..c4956f0a741e 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
@@ -17,11 +17,29 @@ static void __fbnic_mbx_wr_desc(struct fbnic_dev *fbd, int mbx_idx,
{
u32 desc_offset = FBNIC_IPC_MBX(mbx_idx, desc_idx);
+ /* Write the upper 32b and then the lower 32b. Doing this the
+ * FW can then read lower, upper, lower to verify that the state
+ * of the descriptor wasn't changed mid-transaction.
+ */
fw_wr32(fbd, desc_offset + 1, upper_32_bits(desc));
fw_wrfl(fbd);
fw_wr32(fbd, desc_offset, lower_32_bits(desc));
}
+static void __fbnic_mbx_invalidate_desc(struct fbnic_dev *fbd, int mbx_idx,
+ int desc_idx, u32 desc)
+{
+ u32 desc_offset = FBNIC_IPC_MBX(mbx_idx, desc_idx);
+
+ /* For initialization we write the lower 32b of the descriptor first.
+ * This way we can set the state to mark it invalid before we clear the
+ * upper 32b.
+ */
+ fw_wr32(fbd, desc_offset, desc);
+ fw_wrfl(fbd);
+ fw_wr32(fbd, desc_offset + 1, 0);
+}
+
static u64 __fbnic_mbx_rd_desc(struct fbnic_dev *fbd, int mbx_idx, int desc_idx)
{
u32 desc_offset = FBNIC_IPC_MBX(mbx_idx, desc_idx);
@@ -41,21 +59,17 @@ static void fbnic_mbx_init_desc_ring(struct fbnic_dev *fbd, int mbx_idx)
* solid stop for the firmware to hit when it is done looping
* through the ring.
*/
- __fbnic_mbx_wr_desc(fbd, mbx_idx, 0, 0);
-
- fw_wrfl(fbd);
+ __fbnic_mbx_invalidate_desc(fbd, mbx_idx, 0, 0);
/* We then fill the rest of the ring starting at the end and moving
* back toward descriptor 0 with skip descriptors that have no
* length nor address, and tell the firmware that they can skip
* them and just move past them to the one we initialized to 0.
*/
- for (desc_idx = FBNIC_IPC_MBX_DESC_LEN; --desc_idx;) {
- __fbnic_mbx_wr_desc(fbd, mbx_idx, desc_idx,
- FBNIC_IPC_MBX_DESC_FW_CMPL |
- FBNIC_IPC_MBX_DESC_HOST_CMPL);
- fw_wrfl(fbd);
- }
+ for (desc_idx = FBNIC_IPC_MBX_DESC_LEN; --desc_idx;)
+ __fbnic_mbx_invalidate_desc(fbd, mbx_idx, desc_idx,
+ FBNIC_IPC_MBX_DESC_FW_CMPL |
+ FBNIC_IPC_MBX_DESC_HOST_CMPL);
}
void fbnic_mbx_init(struct fbnic_dev *fbd)
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [net PATCH 2/6] fbnic: Gate AXI read/write enabling on FW mailbox
2025-05-01 23:29 [net PATCH 0/6] fbnic: FW IPC Mailbox fixes Alexander Duyck
2025-05-01 23:29 ` [net PATCH 1/6] fbnic: Fix initialization of mailbox descriptor rings Alexander Duyck
@ 2025-05-01 23:30 ` Alexander Duyck
2025-05-02 10:50 ` Simon Horman
2025-05-01 23:30 ` [net PATCH 3/6] fbnic: Add additional handling of IRQs Alexander Duyck
` (3 subsequent siblings)
5 siblings, 1 reply; 16+ messages in thread
From: Alexander Duyck @ 2025-05-01 23:30 UTC (permalink / raw)
To: netdev; +Cc: davem, kuba, pabeni
From: Alexander Duyck <alexanderduyck@fb.com>
In order to prevent the device from throwing spurious writes and/or reads
at us we need to gate the AXI fabric interface to the PCIe until such time
as we know the FW is in a known good state.
To accomplish this we use the mailbox as a mechanism for us to recognize
that the FW has acknowledged our presence and is no longer sending any
stale message data to us.
We start in fbnic_mbx_init by calling fbnic_mbx_reset_desc_ring function,
disabling the DMA in both directions, and then invalidating all the
descriptors in each ring.
We then poll the mailbox in fbnic_mbx_poll_tx_ready and when the interrupt
is set by the FW we pick it up and mark the mailboxes as ready, while also
enabling the DMA.
Once we have completed all the transactions and need to shut down we call
into fbnic_mbx_clean which will in turn call fbnic_mbx_reset_desc_ring for
each ring and shut down the DMA and once again invalidate the descriptors.
Fixes: 3646153161f1 ("eth: fbnic: Add register init to set PCIe/Ethernet device config")
Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism")
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
---
drivers/net/ethernet/meta/fbnic/fbnic_csr.h | 2 +
drivers/net/ethernet/meta/fbnic/fbnic_fw.c | 38 ++++++++++++++++++++++-----
drivers/net/ethernet/meta/fbnic/fbnic_mac.c | 6 ----
3 files changed, 33 insertions(+), 13 deletions(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_csr.h b/drivers/net/ethernet/meta/fbnic/fbnic_csr.h
index 3b12a0ab5906..51bee8072420 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_csr.h
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_csr.h
@@ -796,8 +796,10 @@ enum {
/* PUL User Registers */
#define FBNIC_CSR_START_PUL_USER 0x31000 /* CSR section delimiter */
#define FBNIC_PUL_OB_TLP_HDR_AW_CFG 0x3103d /* 0xc40f4 */
+#define FBNIC_PUL_OB_TLP_HDR_AW_CFG_FLUSH CSR_BIT(19)
#define FBNIC_PUL_OB_TLP_HDR_AW_CFG_BME CSR_BIT(18)
#define FBNIC_PUL_OB_TLP_HDR_AR_CFG 0x3103e /* 0xc40f8 */
+#define FBNIC_PUL_OB_TLP_HDR_AR_CFG_FLUSH CSR_BIT(19)
#define FBNIC_PUL_OB_TLP_HDR_AR_CFG_BME CSR_BIT(18)
#define FBNIC_PUL_USER_OB_RD_TLP_CNT_31_0 \
0x3106e /* 0xc41b8 */
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
index c4956f0a741e..f4749bfd840c 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
@@ -51,10 +51,26 @@ static u64 __fbnic_mbx_rd_desc(struct fbnic_dev *fbd, int mbx_idx, int desc_idx)
return desc;
}
-static void fbnic_mbx_init_desc_ring(struct fbnic_dev *fbd, int mbx_idx)
+static void fbnic_mbx_reset_desc_ring(struct fbnic_dev *fbd, int mbx_idx)
{
int desc_idx;
+ /* Disable DMA transactions from the device,
+ * and flush any transactions triggered during cleaning
+ */
+ switch (mbx_idx) {
+ case FBNIC_IPC_MBX_RX_IDX:
+ wr32(fbd, FBNIC_PUL_OB_TLP_HDR_AW_CFG,
+ FBNIC_PUL_OB_TLP_HDR_AW_CFG_FLUSH);
+ break;
+ case FBNIC_IPC_MBX_TX_IDX:
+ wr32(fbd, FBNIC_PUL_OB_TLP_HDR_AR_CFG,
+ FBNIC_PUL_OB_TLP_HDR_AR_CFG_FLUSH);
+ break;
+ }
+
+ wrfl(fbd);
+
/* Initialize first descriptor to all 0s. Doing this gives us a
* solid stop for the firmware to hit when it is done looping
* through the ring.
@@ -90,7 +106,7 @@ void fbnic_mbx_init(struct fbnic_dev *fbd)
wr32(fbd, FBNIC_INTR_CLEAR(0), 1u << FBNIC_FW_MSIX_ENTRY);
for (i = 0; i < FBNIC_IPC_MBX_INDICES; i++)
- fbnic_mbx_init_desc_ring(fbd, i);
+ fbnic_mbx_reset_desc_ring(fbd, i);
}
static int fbnic_mbx_map_msg(struct fbnic_dev *fbd, int mbx_idx,
@@ -155,7 +171,7 @@ static void fbnic_mbx_clean_desc_ring(struct fbnic_dev *fbd, int mbx_idx)
{
int i;
- fbnic_mbx_init_desc_ring(fbd, mbx_idx);
+ fbnic_mbx_reset_desc_ring(fbd, mbx_idx);
for (i = FBNIC_IPC_MBX_DESC_LEN; i--;)
fbnic_mbx_unmap_and_free_msg(fbd, mbx_idx, i);
@@ -354,7 +370,7 @@ static int fbnic_fw_xmit_cap_msg(struct fbnic_dev *fbd)
return (err == -EOPNOTSUPP) ? 0 : err;
}
-static void fbnic_mbx_postinit_desc_ring(struct fbnic_dev *fbd, int mbx_idx)
+static void fbnic_mbx_init_desc_ring(struct fbnic_dev *fbd, int mbx_idx)
{
struct fbnic_fw_mbx *mbx = &fbd->mbx[mbx_idx];
@@ -366,10 +382,18 @@ static void fbnic_mbx_postinit_desc_ring(struct fbnic_dev *fbd, int mbx_idx)
switch (mbx_idx) {
case FBNIC_IPC_MBX_RX_IDX:
+ /* Enable DMA writes from the device */
+ wr32(fbd, FBNIC_PUL_OB_TLP_HDR_AW_CFG,
+ FBNIC_PUL_OB_TLP_HDR_AW_CFG_BME);
+
/* Make sure we have a page for the FW to write to */
fbnic_mbx_alloc_rx_msgs(fbd);
break;
case FBNIC_IPC_MBX_TX_IDX:
+ /* Enable DMA reads from the device */
+ wr32(fbd, FBNIC_PUL_OB_TLP_HDR_AR_CFG,
+ FBNIC_PUL_OB_TLP_HDR_AR_CFG_BME);
+
/* Force version to 1 if we successfully requested an update
* from the firmware. This should be overwritten once we get
* the actual version from the firmware in the capabilities
@@ -386,7 +410,7 @@ static void fbnic_mbx_postinit(struct fbnic_dev *fbd)
{
int i;
- /* We only need to do this on the first interrupt following init.
+ /* We only need to do this on the first interrupt following reset.
* this primes the mailbox so that we will have cleared all the
* skip descriptors.
*/
@@ -396,7 +420,7 @@ static void fbnic_mbx_postinit(struct fbnic_dev *fbd)
wr32(fbd, FBNIC_INTR_CLEAR(0), 1u << FBNIC_FW_MSIX_ENTRY);
for (i = 0; i < FBNIC_IPC_MBX_INDICES; i++)
- fbnic_mbx_postinit_desc_ring(fbd, i);
+ fbnic_mbx_init_desc_ring(fbd, i);
}
/**
@@ -894,7 +918,7 @@ int fbnic_mbx_poll_tx_ready(struct fbnic_dev *fbd)
* avoid the mailbox getting stuck closed if the interrupt
* is reset.
*/
- fbnic_mbx_init_desc_ring(fbd, FBNIC_IPC_MBX_TX_IDX);
+ fbnic_mbx_reset_desc_ring(fbd, FBNIC_IPC_MBX_TX_IDX);
msleep(200);
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_mac.c b/drivers/net/ethernet/meta/fbnic/fbnic_mac.c
index 14291401f463..dde4a37116e2 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_mac.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_mac.c
@@ -79,12 +79,6 @@ static void fbnic_mac_init_axi(struct fbnic_dev *fbd)
fbnic_init_readrq(fbd, FBNIC_QM_RNI_RBP_CTL, cls, readrq);
fbnic_init_mps(fbd, FBNIC_QM_RNI_RDE_CTL, cls, mps);
fbnic_init_mps(fbd, FBNIC_QM_RNI_RCM_CTL, cls, mps);
-
- /* Enable XALI AR/AW outbound */
- wr32(fbd, FBNIC_PUL_OB_TLP_HDR_AW_CFG,
- FBNIC_PUL_OB_TLP_HDR_AW_CFG_BME);
- wr32(fbd, FBNIC_PUL_OB_TLP_HDR_AR_CFG,
- FBNIC_PUL_OB_TLP_HDR_AR_CFG_BME);
}
static void fbnic_mac_init_qm(struct fbnic_dev *fbd)
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [net PATCH 3/6] fbnic: Add additional handling of IRQs
2025-05-01 23:29 [net PATCH 0/6] fbnic: FW IPC Mailbox fixes Alexander Duyck
2025-05-01 23:29 ` [net PATCH 1/6] fbnic: Fix initialization of mailbox descriptor rings Alexander Duyck
2025-05-01 23:30 ` [net PATCH 2/6] fbnic: Gate AXI read/write enabling on FW mailbox Alexander Duyck
@ 2025-05-01 23:30 ` Alexander Duyck
2025-05-02 13:51 ` Simon Horman
2025-05-01 23:30 ` [net PATCH 4/6] fbnic: Actually flush_tx instead of stalling out Alexander Duyck
` (2 subsequent siblings)
5 siblings, 1 reply; 16+ messages in thread
From: Alexander Duyck @ 2025-05-01 23:30 UTC (permalink / raw)
To: netdev; +Cc: davem, kuba, pabeni
From: Alexander Duyck <alexanderduyck@fb.com>
We have two issues that need to be addressed in our IRQ handling.
One is the fact that we can end up double-freeing IRQs in the event of an
exception handling error such as a PCIe reset/recovery that fails. To
prevent that from becoming an issue we can use the msix_vector values to
indicate that we have successfully requested/freed the IRQ by only setting
or clearing them when we have completed the tiven action.
The other issue is that we have several potential races in our IRQ path due
to us manipulating the mask before the vector has been truly disabled. In
order to handle that in the case of the FW mailbox we need to not
auto-enable the IRQ and instead will be enabling/disabling it separately.
In the case of the PCS vector we can mitigate this by unmapping it and
synchronizing the IRQ before we clear the mask.
The general order of operations after this change is now to request the
interrupt, poll the FW mailbox to ready, and then enable the interrupt. For
the shutdown we do the reverse where we disable the interrupt, flush any
pending Tx, and then free the IRQ. I am renaming the enable/disable to
request/free to be equivilent with the IRQ calls being used. We may see
additions in the future to enable/disable the IRQs versus request/free them
for certain use cases.
Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism")
Fixes: 69684376eed5 ("eth: fbnic: Add link detection")
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
---
drivers/net/ethernet/meta/fbnic/fbnic.h | 8 +
drivers/net/ethernet/meta/fbnic/fbnic_irq.c | 142 ++++++++++++++++--------
drivers/net/ethernet/meta/fbnic/fbnic_netdev.c | 5 +
drivers/net/ethernet/meta/fbnic/fbnic_pci.c | 14 +-
4 files changed, 110 insertions(+), 59 deletions(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic.h b/drivers/net/ethernet/meta/fbnic/fbnic.h
index 4ca7b99ef131..de6b1a340f55 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic.h
+++ b/drivers/net/ethernet/meta/fbnic/fbnic.h
@@ -154,14 +154,14 @@ struct fbnic_dev *fbnic_devlink_alloc(struct pci_dev *pdev);
void fbnic_devlink_register(struct fbnic_dev *fbd);
void fbnic_devlink_unregister(struct fbnic_dev *fbd);
-int fbnic_fw_enable_mbx(struct fbnic_dev *fbd);
-void fbnic_fw_disable_mbx(struct fbnic_dev *fbd);
+int fbnic_fw_request_mbx(struct fbnic_dev *fbd);
+void fbnic_fw_free_mbx(struct fbnic_dev *fbd);
void fbnic_hwmon_register(struct fbnic_dev *fbd);
void fbnic_hwmon_unregister(struct fbnic_dev *fbd);
-int fbnic_pcs_irq_enable(struct fbnic_dev *fbd);
-void fbnic_pcs_irq_disable(struct fbnic_dev *fbd);
+int fbnic_pcs_request_irq(struct fbnic_dev *fbd);
+void fbnic_pcs_free_irq(struct fbnic_dev *fbd);
void fbnic_napi_name_irqs(struct fbnic_dev *fbd);
int fbnic_napi_request_irq(struct fbnic_dev *fbd,
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_irq.c b/drivers/net/ethernet/meta/fbnic/fbnic_irq.c
index 1bbc0e56f3a0..1c88a2bf3a7a 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_irq.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_irq.c
@@ -19,69 +19,105 @@ static irqreturn_t fbnic_fw_msix_intr(int __always_unused irq, void *data)
return IRQ_HANDLED;
}
+static int __fbnic_fw_enable_mbx(struct fbnic_dev *fbd, int vector)
+{
+ int err;
+
+ /* Initialize mailbox and attempt to poll it into ready state */
+ fbnic_mbx_init(fbd);
+ err = fbnic_mbx_poll_tx_ready(fbd);
+ if (err) {
+ dev_warn(fbd->dev, "FW mailbox did not enter ready state\n");
+ return err;
+ }
+
+ /* Enable interrupt and unmask the vector */
+ enable_irq(vector);
+ fbnic_wr32(fbd, FBNIC_INTR_MASK_CLEAR(0), 1u << FBNIC_FW_MSIX_ENTRY);
+
+ return 0;
+}
+
/**
- * fbnic_fw_enable_mbx - Configure and initialize Firmware Mailbox
+ * fbnic_fw_request_mbx - Configure and initialize Firmware Mailbox
* @fbd: Pointer to device to initialize
*
- * This function will initialize the firmware mailbox rings, enable the IRQ
- * and initialize the communication between the Firmware and the host. The
- * firmware is expected to respond to the initialization by sending an
- * interrupt essentially notifying the host that it has seen the
- * initialization and is now synced up.
+ * This function will allocate the IRQ and then reinitialize the mailbox
+ * starting communication between the host and firmware.
*
* Return: non-zero on failure.
**/
-int fbnic_fw_enable_mbx(struct fbnic_dev *fbd)
+int fbnic_fw_request_mbx(struct fbnic_dev *fbd)
{
- u32 vector = fbd->fw_msix_vector;
- int err;
+ struct pci_dev *pdev = to_pci_dev(fbd->dev);
+ int vector, err;
+
+ WARN_ON(fbd->fw_msix_vector);
+
+ vector = pci_irq_vector(pdev, FBNIC_FW_MSIX_ENTRY);
+ if (vector < 0)
+ return vector;
/* Request the IRQ for FW Mailbox vector. */
err = request_threaded_irq(vector, NULL, &fbnic_fw_msix_intr,
- IRQF_ONESHOT, dev_name(fbd->dev), fbd);
+ IRQF_ONESHOT | IRQF_NO_AUTOEN,
+ dev_name(fbd->dev), fbd);
if (err)
return err;
/* Initialize mailbox and attempt to poll it into ready state */
- fbnic_mbx_init(fbd);
- err = fbnic_mbx_poll_tx_ready(fbd);
- if (err) {
- dev_warn(fbd->dev, "FW mailbox did not enter ready state\n");
+ err = __fbnic_fw_enable_mbx(fbd, vector);
+ if (err)
free_irq(vector, fbd);
- return err;
- }
- /* Enable interrupts */
- fbnic_wr32(fbd, FBNIC_INTR_MASK_CLEAR(0), 1u << FBNIC_FW_MSIX_ENTRY);
+ fbd->fw_msix_vector = vector;
- return 0;
+ return err;
}
/**
- * fbnic_fw_disable_mbx - Disable mailbox and place it in standby state
- * @fbd: Pointer to device to disable
+ * fbnic_fw_disable_mbx - Temporarily place mailbox in standby state
+ * @fbd: Pointer to device
*
- * This function will disable the mailbox interrupt, free any messages still
- * in the mailbox and place it into a standby state. The firmware is
- * expected to see the update and assume that the host is in the reset state.
+ * Shutdown the mailbox by notifying the firmware to stop sending us logs, mask
+ * and synchronize the IRQ, and then clean up the rings.
**/
-void fbnic_fw_disable_mbx(struct fbnic_dev *fbd)
+static void fbnic_fw_disable_mbx(struct fbnic_dev *fbd)
{
- /* Disable interrupt and free vector */
- fbnic_wr32(fbd, FBNIC_INTR_MASK_SET(0), 1u << FBNIC_FW_MSIX_ENTRY);
+ /* Disable interrupt and synchronize the IRQ */
+ disable_irq(fbd->fw_msix_vector);
- /* Free the vector */
- free_irq(fbd->fw_msix_vector, fbd);
+ /* Mask the vector */
+ fbnic_wr32(fbd, FBNIC_INTR_MASK_SET(0), 1u << FBNIC_FW_MSIX_ENTRY);
/* Make sure disabling logs message is sent, must be done here to
* avoid risk of completing without a running interrupt.
*/
fbnic_mbx_flush_tx(fbd);
-
- /* Reset the mailboxes to the initialized state */
fbnic_mbx_clean(fbd);
}
+/**
+ * fbnic_fw_free_mbx - Disable mailbox and place it in standby state
+ * @fbd: Pointer to device to disable
+ *
+ * This function will disable the mailbox interrupt, free any messages still
+ * in the mailbox and place it into a disabled state. The firmware is
+ * expected to see the update and assume that the host is in the reset state.
+ **/
+void fbnic_fw_free_mbx(struct fbnic_dev *fbd)
+{
+ /* Vector has already been freed */
+ if (!fbd->fw_msix_vector)
+ return;
+
+ fbnic_fw_disable_mbx(fbd);
+
+ /* Free the vector */
+ free_irq(fbd->fw_msix_vector, fbd);
+ fbd->fw_msix_vector = 0;
+}
+
static irqreturn_t fbnic_pcs_msix_intr(int __always_unused irq, void *data)
{
struct fbnic_dev *fbd = data;
@@ -101,7 +137,7 @@ static irqreturn_t fbnic_pcs_msix_intr(int __always_unused irq, void *data)
}
/**
- * fbnic_pcs_irq_enable - Configure the MAC to enable it to advertise link
+ * fbnic_pcs_request_irq - Configure the PCS to enable it to advertise link
* @fbd: Pointer to device to initialize
*
* This function provides basic bringup for the MAC/PCS IRQ. For now the IRQ
@@ -109,41 +145,61 @@ static irqreturn_t fbnic_pcs_msix_intr(int __always_unused irq, void *data)
*
* Return: non-zero on failure.
**/
-int fbnic_pcs_irq_enable(struct fbnic_dev *fbd)
+int fbnic_pcs_request_irq(struct fbnic_dev *fbd)
{
- u32 vector = fbd->pcs_msix_vector;
- int err;
+ struct pci_dev *pdev = to_pci_dev(fbd->dev);
+ int vector, err;
- /* Request the IRQ for MAC link vector.
- * Map MAC cause to it, and unmask it
+ WARN_ON(fbd->pcs_msix_vector);
+
+ vector = pci_irq_vector(pdev, FBNIC_PCS_MSIX_ENTRY);
+ if (vector < 0)
+ return vector;
+
+ /* Request the IRQ for PCS link vector.
+ * Map PCS cause to it, and unmask it
*/
err = request_irq(vector, &fbnic_pcs_msix_intr, 0,
fbd->netdev->name, fbd);
if (err)
return err;
+ /* Map and enable interrupt, unmask vector after link is configured */
fbnic_wr32(fbd, FBNIC_INTR_MSIX_CTRL(FBNIC_INTR_MSIX_CTRL_PCS_IDX),
FBNIC_PCS_MSIX_ENTRY | FBNIC_INTR_MSIX_CTRL_ENABLE);
+ fbd->pcs_msix_vector = vector;
+
return 0;
}
/**
- * fbnic_pcs_irq_disable - Teardown the MAC IRQ to prepare for stopping
+ * fbnic_pcs_free_irq - Teardown the PCS IRQ to prepare for stopping
* @fbd: Pointer to device that is stopping
*
- * This function undoes the work done in fbnic_pcs_irq_enable and prepares
+ * This function undoes the work done in fbnic_pcs_request_irq and prepares
* the device to no longer receive traffic on the host interface.
**/
-void fbnic_pcs_irq_disable(struct fbnic_dev *fbd)
+void fbnic_pcs_free_irq(struct fbnic_dev *fbd)
{
+ /* Vector has already been freed */
+ if (!fbd->pcs_msix_vector)
+ return;
+
/* Disable interrupt */
fbnic_wr32(fbd, FBNIC_INTR_MSIX_CTRL(FBNIC_INTR_MSIX_CTRL_PCS_IDX),
FBNIC_PCS_MSIX_ENTRY);
+ fbnic_wrfl(fbd);
+
+ /* Synchronize IRQ to prevent race that would unmask vector */
+ synchronize_irq(fbd->pcs_msix_vector);
+
+ /* Mask the vector */
fbnic_wr32(fbd, FBNIC_INTR_MASK_SET(0), 1u << FBNIC_PCS_MSIX_ENTRY);
/* Free the vector */
free_irq(fbd->pcs_msix_vector, fbd);
+ fbd->pcs_msix_vector = 0;
}
void fbnic_synchronize_irq(struct fbnic_dev *fbd, int nr)
@@ -226,9 +282,6 @@ void fbnic_free_irqs(struct fbnic_dev *fbd)
{
struct pci_dev *pdev = to_pci_dev(fbd->dev);
- fbd->pcs_msix_vector = 0;
- fbd->fw_msix_vector = 0;
-
fbd->num_irqs = 0;
pci_free_irq_vectors(pdev);
@@ -254,8 +307,5 @@ int fbnic_alloc_irqs(struct fbnic_dev *fbd)
fbd->num_irqs = num_irqs;
- fbd->pcs_msix_vector = pci_irq_vector(pdev, FBNIC_PCS_MSIX_ENTRY);
- fbd->fw_msix_vector = pci_irq_vector(pdev, FBNIC_FW_MSIX_ENTRY);
-
return 0;
}
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
index 79a01fdd1dd1..2524d9b88d59 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_netdev.c
@@ -44,9 +44,10 @@ int __fbnic_open(struct fbnic_net *fbn)
if (err)
goto time_stop;
- err = fbnic_pcs_irq_enable(fbd);
+ err = fbnic_pcs_request_irq(fbd);
if (err)
goto time_stop;
+
/* Pull the BMC config and initialize the RPC */
fbnic_bmc_rpc_init(fbd);
fbnic_rss_reinit(fbd, fbn);
@@ -82,7 +83,7 @@ static int fbnic_stop(struct net_device *netdev)
struct fbnic_net *fbn = netdev_priv(netdev);
fbnic_down(fbn);
- fbnic_pcs_irq_disable(fbn->fbd);
+ fbnic_pcs_free_irq(fbn->fbd);
fbnic_time_stop(fbn);
fbnic_fw_xmit_ownership_msg(fbn->fbd, false);
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_pci.c b/drivers/net/ethernet/meta/fbnic/fbnic_pci.c
index 6cbbc2ee3e1f..4e8595239c0f 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_pci.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_pci.c
@@ -283,7 +283,7 @@ static int fbnic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
goto free_irqs;
}
- err = fbnic_fw_enable_mbx(fbd);
+ err = fbnic_fw_request_mbx(fbd);
if (err) {
dev_err(&pdev->dev,
"Firmware mailbox initialization failure\n");
@@ -363,7 +363,7 @@ static void fbnic_remove(struct pci_dev *pdev)
fbnic_hwmon_unregister(fbd);
fbnic_dbg_fbd_exit(fbd);
fbnic_devlink_unregister(fbd);
- fbnic_fw_disable_mbx(fbd);
+ fbnic_fw_free_mbx(fbd);
fbnic_free_irqs(fbd);
fbnic_devlink_free(fbd);
@@ -387,7 +387,7 @@ static int fbnic_pm_suspend(struct device *dev)
rtnl_unlock();
null_uc_addr:
- fbnic_fw_disable_mbx(fbd);
+ fbnic_fw_free_mbx(fbd);
/* Free the IRQs so they aren't trying to occupy sleeping CPUs */
fbnic_free_irqs(fbd);
@@ -420,7 +420,7 @@ static int __fbnic_pm_resume(struct device *dev)
fbd->mac->init_regs(fbd);
/* Re-enable mailbox */
- err = fbnic_fw_enable_mbx(fbd);
+ err = fbnic_fw_request_mbx(fbd);
if (err)
goto err_free_irqs;
@@ -438,15 +438,15 @@ static int __fbnic_pm_resume(struct device *dev)
if (netif_running(netdev)) {
err = __fbnic_open(fbn);
if (err)
- goto err_disable_mbx;
+ goto err_free_mbx;
}
rtnl_unlock();
return 0;
-err_disable_mbx:
+err_free_mbx:
rtnl_unlock();
- fbnic_fw_disable_mbx(fbd);
+ fbnic_fw_free_mbx(fbd);
err_free_irqs:
fbnic_free_irqs(fbd);
err_invalidate_uc_addr:
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [net PATCH 4/6] fbnic: Actually flush_tx instead of stalling out
2025-05-01 23:29 [net PATCH 0/6] fbnic: FW IPC Mailbox fixes Alexander Duyck
` (2 preceding siblings ...)
2025-05-01 23:30 ` [net PATCH 3/6] fbnic: Add additional handling of IRQs Alexander Duyck
@ 2025-05-01 23:30 ` Alexander Duyck
2025-05-02 10:54 ` Simon Horman
2025-05-01 23:30 ` [net PATCH 5/6] fbnic: Cleanup handling of completions Alexander Duyck
2025-05-01 23:30 ` [net PATCH 6/6] fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context Alexander Duyck
5 siblings, 1 reply; 16+ messages in thread
From: Alexander Duyck @ 2025-05-01 23:30 UTC (permalink / raw)
To: netdev; +Cc: davem, kuba, pabeni
From: Alexander Duyck <alexanderduyck@fb.com>
The fbnic_mbx_flush_tx function had a number of issues.
First, we were waiting 200ms for the firmware to process the packets. We
can drop this to 20ms and in almost all cases this should be more than
enough time. So by changing this we can significantly reduce shutdown time.
Second, we were not making sure that the Tx path was actually shut off. As
such we could still have packets added while we were flushing the mailbox.
To prevent that we can now clear the ready flag for the Tx side and it
should stay down since the interrupt is disabled.
Third, we kept re-reading the tail due to the second issue. The tail should
not move after we have started the flush so we can just read it once while
we are holding the mailbox Tx lock. By doing that we are guaranteed that
the value should be consistent.
Fourth, we were keeping a count of descriptors cleaned due to the second
and third issues called out. That count is not a valid reason to be exiting
the cleanup, and with the tail only being read once we shouldn't see any
cases where the tail moves after the disable so the tracking of count can
be dropped.
Fifth, we were using attempts * sleep time to determine how long we would
wait in our polling loop to flush out the Tx. This can be very imprecise.
In order to tighten up the timing we are shifting over to using a jiffies
value of jiffies + 10 * HZ + 1 to determine the jiffies value we should
stop polling at as this should be accurate within once sleep cycle for the
total amount of time spent polling.
Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism")
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
---
drivers/net/ethernet/meta/fbnic/fbnic_fw.c | 31 ++++++++++++++--------------
1 file changed, 16 insertions(+), 15 deletions(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
index f4749bfd840c..d019191d6ae9 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
@@ -930,35 +930,36 @@ int fbnic_mbx_poll_tx_ready(struct fbnic_dev *fbd)
void fbnic_mbx_flush_tx(struct fbnic_dev *fbd)
{
+ unsigned long timeout = jiffies + 10 * HZ + 1;
struct fbnic_fw_mbx *tx_mbx;
- int attempts = 50;
- u8 count = 0;
-
- /* Nothing to do if there is no mailbox */
- if (!fbnic_fw_present(fbd))
- return;
+ u8 tail;
/* Record current Rx stats */
tx_mbx = &fbd->mbx[FBNIC_IPC_MBX_TX_IDX];
- /* Nothing to do if mailbox never got to ready */
- if (!tx_mbx->ready)
- return;
+ spin_lock_irq(&fbd->fw_tx_lock);
+
+ /* Clear ready to prevent any further attempts to transmit */
+ tx_mbx->ready = false;
+
+ /* Read tail to determine the last tail state for the ring */
+ tail = tx_mbx->tail;
+
+ spin_unlock_irq(&fbd->fw_tx_lock);
/* Give firmware time to process packet,
- * we will wait up to 10 seconds which is 50 waits of 200ms.
+ * we will wait up to 10 seconds which is 500 waits of 20ms.
*/
do {
u8 head = tx_mbx->head;
- if (head == tx_mbx->tail)
+ /* Tx ring is empty once head == tail */
+ if (head == tail)
break;
- msleep(200);
+ msleep(20);
fbnic_mbx_process_tx_msgs(fbd);
-
- count += (tx_mbx->head - head) % FBNIC_IPC_MBX_DESC_LEN;
- } while (count < FBNIC_IPC_MBX_DESC_LEN && --attempts);
+ } while (time_is_after_jiffies(timeout));
}
void fbnic_get_fw_ver_commit_str(struct fbnic_dev *fbd, char *fw_version,
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [net PATCH 5/6] fbnic: Cleanup handling of completions
2025-05-01 23:29 [net PATCH 0/6] fbnic: FW IPC Mailbox fixes Alexander Duyck
` (3 preceding siblings ...)
2025-05-01 23:30 ` [net PATCH 4/6] fbnic: Actually flush_tx instead of stalling out Alexander Duyck
@ 2025-05-01 23:30 ` Alexander Duyck
2025-05-02 10:45 ` Simon Horman
2025-05-01 23:30 ` [net PATCH 6/6] fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context Alexander Duyck
5 siblings, 1 reply; 16+ messages in thread
From: Alexander Duyck @ 2025-05-01 23:30 UTC (permalink / raw)
To: netdev; +Cc: davem, kuba, pabeni
From: Alexander Duyck <alexanderduyck@fb.com>
There was an issue in that if we were to shutdown we could be left with
a completion in flight as the mailbox went away. To address that I have
added an fbnic_mbx_evict_all_cmpl function that is meant to essentially
create a "broken pipe" type response so that all callers will receive an
error indicating that the connection has been broken as a result of us
shutting down the mailbox.
In addition the naming was inconsistent between the creation and clearing
of completions. Since the cmpl seems to be the common suffix to use for the
handling of cmpl_data I went through and renamed fbnic_fw_clear_compl to
fbnic_fw_clear_cmpl.
Fixes: 378e5cc1c6c6 ("eth: fbnic: hwmon: Add completion infrastructure for firmware requests")
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
---
drivers/net/ethernet/meta/fbnic/fbnic_fw.c | 22 +++++++++++++++++++++-
drivers/net/ethernet/meta/fbnic/fbnic_fw.h | 2 +-
drivers/net/ethernet/meta/fbnic/fbnic_mac.c | 2 +-
3 files changed, 23 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
index d019191d6ae9..efc0176f1a9a 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
@@ -928,6 +928,23 @@ int fbnic_mbx_poll_tx_ready(struct fbnic_dev *fbd)
return attempts ? 0 : -ETIMEDOUT;
}
+static void __fbnic_fw_evict_cmpl(struct fbnic_fw_completion *cmpl_data)
+{
+ cmpl_data->result = -EPIPE;
+ complete(&cmpl_data->done);
+}
+
+static void fbnic_mbx_evict_all_cmpl(struct fbnic_dev *fbd)
+{
+ struct fbnic_fw_completion *cmpl_data;
+
+ cmpl_data = fbd->cmpl_data;
+ if (cmpl_data)
+ __fbnic_fw_evict_cmpl(cmpl_data);
+
+ memset(&fbd->cmpl_data, 0, sizeof(fbd->cmpl_data));
+}
+
void fbnic_mbx_flush_tx(struct fbnic_dev *fbd)
{
unsigned long timeout = jiffies + 10 * HZ + 1;
@@ -945,6 +962,9 @@ void fbnic_mbx_flush_tx(struct fbnic_dev *fbd)
/* Read tail to determine the last tail state for the ring */
tail = tx_mbx->tail;
+ /* Flush any completions as we are no longer processing Rx */
+ fbnic_mbx_evict_all_cmpl(fbd);
+
spin_unlock_irq(&fbd->fw_tx_lock);
/* Give firmware time to process packet,
@@ -983,7 +1003,7 @@ void fbnic_fw_init_cmpl(struct fbnic_fw_completion *fw_cmpl,
kref_init(&fw_cmpl->ref_count);
}
-void fbnic_fw_clear_compl(struct fbnic_dev *fbd)
+void fbnic_fw_clear_cmpl(struct fbnic_dev *fbd)
{
unsigned long flags;
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_fw.h b/drivers/net/ethernet/meta/fbnic/fbnic_fw.h
index a3618e7826c2..2d5e0ff1982c 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_fw.h
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_fw.h
@@ -69,7 +69,7 @@ int fbnic_fw_xmit_tsene_read_msg(struct fbnic_dev *fbd,
struct fbnic_fw_completion *cmpl_data);
void fbnic_fw_init_cmpl(struct fbnic_fw_completion *cmpl_data,
u32 msg_type);
-void fbnic_fw_clear_compl(struct fbnic_dev *fbd);
+void fbnic_fw_clear_cmpl(struct fbnic_dev *fbd);
void fbnic_fw_put_cmpl(struct fbnic_fw_completion *cmpl_data);
#define fbnic_mk_full_fw_ver_str(_rev_id, _delim, _commit, _str, _str_sz) \
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_mac.c b/drivers/net/ethernet/meta/fbnic/fbnic_mac.c
index dde4a37116e2..7e54f82535f6 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_mac.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_mac.c
@@ -744,7 +744,7 @@ static int fbnic_mac_get_sensor_asic(struct fbnic_dev *fbd, int id,
*val = *sensor;
exit_cleanup:
- fbnic_fw_clear_compl(fbd);
+ fbnic_fw_clear_cmpl(fbd);
exit_free:
fbnic_fw_put_cmpl(fw_cmpl);
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [net PATCH 6/6] fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context
2025-05-01 23:29 [net PATCH 0/6] fbnic: FW IPC Mailbox fixes Alexander Duyck
` (4 preceding siblings ...)
2025-05-01 23:30 ` [net PATCH 5/6] fbnic: Cleanup handling of completions Alexander Duyck
@ 2025-05-01 23:30 ` Alexander Duyck
2025-05-02 16:54 ` Simon Horman
5 siblings, 1 reply; 16+ messages in thread
From: Alexander Duyck @ 2025-05-01 23:30 UTC (permalink / raw)
To: netdev; +Cc: davem, kuba, pabeni
From: Alexander Duyck <alexanderduyck@fb.com>
This change pulls the call to fbnic_fw_xmit_cap_msg out of
fbnic_mbx_init_desc_ring and instead places it in the polling function for
getting the Tx ready. Doing that we can avoid the potential issue with an
interrupt coming in later from the firmware that causes it to get fired in
interrupt context.
In addition we can add additional verification to the poll_tx_ready
function to make sure that the mailbox is actually ready by verifying that
it has populated the capabilities from the firmware. This is important as
the link config relies on this and we were currently delaying this until
the open call was made which would force the capbabilities message to be
processed then. This resolves potential issues with the link state being
inconsistent between the netdev being registered and the open call being
made.
Lastly we can make the overall mailbox poll-to-ready more
reliable/responsive by reducing the overall sleep time and using a jiffies
based timeout method instead of relying on X number of sleeps/"attempts".
Fixes: 20d2e88cc746 ("eth: fbnic: Add initial messaging to notify FW of our presence")
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
---
drivers/net/ethernet/meta/fbnic/fbnic_fw.c | 98 +++++++++++++++-------------
1 file changed, 51 insertions(+), 47 deletions(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
index efc0176f1a9a..0452ea573d4a 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
@@ -95,6 +95,9 @@ void fbnic_mbx_init(struct fbnic_dev *fbd)
/* Initialize lock to protect Tx ring */
spin_lock_init(&fbd->fw_tx_lock);
+ /* Reset FW Capabilities */
+ memset(&fbd->fw_cap, 0, sizeof(fbd->fw_cap));
+
/* Reinitialize mailbox memory */
for (i = 0; i < FBNIC_IPC_MBX_INDICES; i++)
memset(&fbd->mbx[i], 0, sizeof(struct fbnic_fw_mbx));
@@ -352,32 +355,10 @@ static int fbnic_fw_xmit_simple_msg(struct fbnic_dev *fbd, u32 msg_type)
return err;
}
-/**
- * fbnic_fw_xmit_cap_msg - Allocate and populate a FW capabilities message
- * @fbd: FBNIC device structure
- *
- * Return: NULL on failure to allocate, error pointer on error, or pointer
- * to new TLV test message.
- *
- * Sends a single TLV header indicating the host wants the firmware to
- * confirm the capabilities and version.
- **/
-static int fbnic_fw_xmit_cap_msg(struct fbnic_dev *fbd)
-{
- int err = fbnic_fw_xmit_simple_msg(fbd, FBNIC_TLV_MSG_ID_HOST_CAP_REQ);
-
- /* Return 0 if we are not calling this on ASIC */
- return (err == -EOPNOTSUPP) ? 0 : err;
-}
-
static void fbnic_mbx_init_desc_ring(struct fbnic_dev *fbd, int mbx_idx)
{
struct fbnic_fw_mbx *mbx = &fbd->mbx[mbx_idx];
- /* This is a one time init, so just exit if it is completed */
- if (mbx->ready)
- return;
-
mbx->ready = true;
switch (mbx_idx) {
@@ -393,34 +374,22 @@ static void fbnic_mbx_init_desc_ring(struct fbnic_dev *fbd, int mbx_idx)
/* Enable DMA reads from the device */
wr32(fbd, FBNIC_PUL_OB_TLP_HDR_AR_CFG,
FBNIC_PUL_OB_TLP_HDR_AR_CFG_BME);
-
- /* Force version to 1 if we successfully requested an update
- * from the firmware. This should be overwritten once we get
- * the actual version from the firmware in the capabilities
- * request message.
- */
- if (!fbnic_fw_xmit_cap_msg(fbd) &&
- !fbd->fw_cap.running.mgmt.version)
- fbd->fw_cap.running.mgmt.version = 1;
break;
}
}
-static void fbnic_mbx_postinit(struct fbnic_dev *fbd)
+static bool fbnic_mbx_event(struct fbnic_dev *fbd)
{
- int i;
-
/* We only need to do this on the first interrupt following reset.
* this primes the mailbox so that we will have cleared all the
* skip descriptors.
*/
if (!(rd32(fbd, FBNIC_INTR_STATUS(0)) & (1u << FBNIC_FW_MSIX_ENTRY)))
- return;
+ return false;
wr32(fbd, FBNIC_INTR_CLEAR(0), 1u << FBNIC_FW_MSIX_ENTRY);
- for (i = 0; i < FBNIC_IPC_MBX_INDICES; i++)
- fbnic_mbx_init_desc_ring(fbd, i);
+ return true;
}
/**
@@ -897,7 +866,7 @@ static void fbnic_mbx_process_rx_msgs(struct fbnic_dev *fbd)
void fbnic_mbx_poll(struct fbnic_dev *fbd)
{
- fbnic_mbx_postinit(fbd);
+ fbnic_mbx_event(fbd);
fbnic_mbx_process_tx_msgs(fbd);
fbnic_mbx_process_rx_msgs(fbd);
@@ -905,27 +874,62 @@ void fbnic_mbx_poll(struct fbnic_dev *fbd)
int fbnic_mbx_poll_tx_ready(struct fbnic_dev *fbd)
{
- struct fbnic_fw_mbx *tx_mbx;
- int attempts = 50;
+ struct fbnic_fw_mbx *tx_mbx = &fbd->mbx[FBNIC_IPC_MBX_TX_IDX];
+ unsigned long timeout = jiffies + 10 * HZ + 1;
+ int err, i;
- /* Immediate fail if BAR4 isn't there */
- if (!fbnic_fw_present(fbd))
- return -ENODEV;
+ do {
+ if (!time_is_after_jiffies(timeout))
+ return -ETIMEDOUT;
- tx_mbx = &fbd->mbx[FBNIC_IPC_MBX_TX_IDX];
- while (!tx_mbx->ready && --attempts) {
/* Force the firmware to trigger an interrupt response to
* avoid the mailbox getting stuck closed if the interrupt
* is reset.
*/
fbnic_mbx_reset_desc_ring(fbd, FBNIC_IPC_MBX_TX_IDX);
- msleep(200);
+ /* Immediate fail if BAR4 went away */
+ if (!fbnic_fw_present(fbd))
+ return -ENODEV;
+
+ msleep(20);
+ } while (!fbnic_mbx_event(fbd));
+
+ /* FW has shown signs of life. Enable DMA and start Tx/Rx */
+ for (i = 0; i < FBNIC_IPC_MBX_INDICES; i++)
+ fbnic_mbx_init_desc_ring(fbd, i);
+ /* Request an update from the firmware. This should overwrite
+ * mgmt.version once we get the actual version from the firmware
+ * in the capabilities request message.
+ */
+ err = fbnic_fw_xmit_simple_msg(fbd, FBNIC_TLV_MSG_ID_HOST_CAP_REQ);
+ if (err)
+ goto clean_mbx;
+
+ /* Poll until we get a current management firmware version, use "1"
+ * to indicate we entered the polling state waiting for a response
+ */
+ for (fbd->fw_cap.running.mgmt.version = 1;
+ fbd->fw_cap.running.mgmt.version < MIN_FW_VERSION_CODE;) {
+ if (!tx_mbx->ready)
+ err = -ENODEV;
+ if (err)
+ goto clean_mbx;
+
+ msleep(20);
fbnic_mbx_poll(fbd);
+
+ /* set err, but wait till mgmt.version check to report it */
+ if (!time_is_after_jiffies(timeout))
+ err = -ETIMEDOUT;
}
- return attempts ? 0 : -ETIMEDOUT;
+ return 0;
+clean_mbx:
+ /* Cleanup Rx buffers and disable mailbox */
+ fbnic_mbx_clean(fbd);
+ return err;
}
static void __fbnic_fw_evict_cmpl(struct fbnic_fw_completion *cmpl_data)
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [net PATCH 5/6] fbnic: Cleanup handling of completions
2025-05-01 23:30 ` [net PATCH 5/6] fbnic: Cleanup handling of completions Alexander Duyck
@ 2025-05-02 10:45 ` Simon Horman
2025-05-04 14:37 ` Alexander Duyck
0 siblings, 1 reply; 16+ messages in thread
From: Simon Horman @ 2025-05-02 10:45 UTC (permalink / raw)
To: Alexander Duyck; +Cc: netdev, davem, kuba, pabeni
On Thu, May 01, 2025 at 04:30:22PM -0700, Alexander Duyck wrote:
> From: Alexander Duyck <alexanderduyck@fb.com>
>
> There was an issue in that if we were to shutdown we could be left with
> a completion in flight as the mailbox went away. To address that I have
> added an fbnic_mbx_evict_all_cmpl function that is meant to essentially
> create a "broken pipe" type response so that all callers will receive an
> error indicating that the connection has been broken as a result of us
> shutting down the mailbox.
>
> In addition the naming was inconsistent between the creation and clearing
> of completions. Since the cmpl seems to be the common suffix to use for the
> handling of cmpl_data I went through and renamed fbnic_fw_clear_compl to
> fbnic_fw_clear_cmpl.
I do see this is somehow related to the fix described in the first
paragraph. But I don't think renaming functions like this is appropriate
for net.
> Fixes: 378e5cc1c6c6 ("eth: fbnic: hwmon: Add completion infrastructure for firmware requests")
>
> Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
> ---
> drivers/net/ethernet/meta/fbnic/fbnic_fw.c | 22 +++++++++++++++++++++-
> drivers/net/ethernet/meta/fbnic/fbnic_fw.h | 2 +-
> drivers/net/ethernet/meta/fbnic/fbnic_mac.c | 2 +-
> 3 files changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
> index d019191d6ae9..efc0176f1a9a 100644
> --- a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
> +++ b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
> @@ -928,6 +928,23 @@ int fbnic_mbx_poll_tx_ready(struct fbnic_dev *fbd)
> return attempts ? 0 : -ETIMEDOUT;
> }
>
> +static void __fbnic_fw_evict_cmpl(struct fbnic_fw_completion *cmpl_data)
> +{
> + cmpl_data->result = -EPIPE;
> + complete(&cmpl_data->done);
> +}
> +
> +static void fbnic_mbx_evict_all_cmpl(struct fbnic_dev *fbd)
> +{
> + struct fbnic_fw_completion *cmpl_data;
> +
> + cmpl_data = fbd->cmpl_data;
> + if (cmpl_data)
> + __fbnic_fw_evict_cmpl(cmpl_data);
> +
> + memset(&fbd->cmpl_data, 0, sizeof(fbd->cmpl_data));
Maybe I've been staring at my screen for too long, but could this
be expressed more succinctly as:
fbd->cmpl_data = NULL;
And if so, it seems that step can be skipped if cmpl_data is already
NULL, which is already checked.
Leading to the following, which is somehow easier on my brain.
static void fbnic_mbx_evict_all_cmpl(struct fbnic_dev *fbd)
{
if (fbd->cmpl_data) {
__fbnic_fw_evict_cmpl(fbd->cmpl_data);
fbd->cmpl_data = NULL;
}
}
> +}
> +
> void fbnic_mbx_flush_tx(struct fbnic_dev *fbd)
> {
> unsigned long timeout = jiffies + 10 * HZ + 1;
> @@ -945,6 +962,9 @@ void fbnic_mbx_flush_tx(struct fbnic_dev *fbd)
> /* Read tail to determine the last tail state for the ring */
> tail = tx_mbx->tail;
>
> + /* Flush any completions as we are no longer processing Rx */
> + fbnic_mbx_evict_all_cmpl(fbd);
> +
> spin_unlock_irq(&fbd->fw_tx_lock);
>
> /* Give firmware time to process packet,
> @@ -983,7 +1003,7 @@ void fbnic_fw_init_cmpl(struct fbnic_fw_completion *fw_cmpl,
> kref_init(&fw_cmpl->ref_count);
> }
>
> -void fbnic_fw_clear_compl(struct fbnic_dev *fbd)
> +void fbnic_fw_clear_cmpl(struct fbnic_dev *fbd)
> {
> unsigned long flags;
>
> diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_fw.h b/drivers/net/ethernet/meta/fbnic/fbnic_fw.h
> index a3618e7826c2..2d5e0ff1982c 100644
> --- a/drivers/net/ethernet/meta/fbnic/fbnic_fw.h
> +++ b/drivers/net/ethernet/meta/fbnic/fbnic_fw.h
> @@ -69,7 +69,7 @@ int fbnic_fw_xmit_tsene_read_msg(struct fbnic_dev *fbd,
> struct fbnic_fw_completion *cmpl_data);
> void fbnic_fw_init_cmpl(struct fbnic_fw_completion *cmpl_data,
> u32 msg_type);
> -void fbnic_fw_clear_compl(struct fbnic_dev *fbd);
> +void fbnic_fw_clear_cmpl(struct fbnic_dev *fbd);
> void fbnic_fw_put_cmpl(struct fbnic_fw_completion *cmpl_data);
>
> #define fbnic_mk_full_fw_ver_str(_rev_id, _delim, _commit, _str, _str_sz) \
> diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_mac.c b/drivers/net/ethernet/meta/fbnic/fbnic_mac.c
> index dde4a37116e2..7e54f82535f6 100644
> --- a/drivers/net/ethernet/meta/fbnic/fbnic_mac.c
> +++ b/drivers/net/ethernet/meta/fbnic/fbnic_mac.c
> @@ -744,7 +744,7 @@ static int fbnic_mac_get_sensor_asic(struct fbnic_dev *fbd, int id,
>
> *val = *sensor;
> exit_cleanup:
> - fbnic_fw_clear_compl(fbd);
> + fbnic_fw_clear_cmpl(fbd);
> exit_free:
> fbnic_fw_put_cmpl(fw_cmpl);
>
>
>
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [net PATCH 1/6] fbnic: Fix initialization of mailbox descriptor rings
2025-05-01 23:29 ` [net PATCH 1/6] fbnic: Fix initialization of mailbox descriptor rings Alexander Duyck
@ 2025-05-02 10:49 ` Simon Horman
0 siblings, 0 replies; 16+ messages in thread
From: Simon Horman @ 2025-05-02 10:49 UTC (permalink / raw)
To: Alexander Duyck; +Cc: netdev, davem, kuba, pabeni
On Thu, May 01, 2025 at 04:29:57PM -0700, Alexander Duyck wrote:
> From: Alexander Duyck <alexanderduyck@fb.com>
>
> Address to issues with the FW mailbox descriptor initialization.
>
> We need to reverse the order of accesses when we invalidate an entry versus
> writing an entry. When writing an entry we write upper and then lower as
> the lower 32b contain the valid bit that makes the entire address valid.
> However for invalidation we should write it in the reverse order so that
> the upper is marked invalid before we update it.
>
> Without this change we may see FW attempt to access pages with the upper
> 32b of the address set to 0 which will likely result in DMAR faults due to
> write access failures on mailbox shutdown.
>
> Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism")
nit: No blank line here please.
Likewise in other patches in this series.
>
> Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
The nit above aside, this looks good to me.
Reviewed-by: Simon Horman <horms@kernel.org>
...
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [net PATCH 2/6] fbnic: Gate AXI read/write enabling on FW mailbox
2025-05-01 23:30 ` [net PATCH 2/6] fbnic: Gate AXI read/write enabling on FW mailbox Alexander Duyck
@ 2025-05-02 10:50 ` Simon Horman
0 siblings, 0 replies; 16+ messages in thread
From: Simon Horman @ 2025-05-02 10:50 UTC (permalink / raw)
To: Alexander Duyck; +Cc: netdev, davem, kuba, pabeni
On Thu, May 01, 2025 at 04:30:03PM -0700, Alexander Duyck wrote:
> From: Alexander Duyck <alexanderduyck@fb.com>
>
> In order to prevent the device from throwing spurious writes and/or reads
> at us we need to gate the AXI fabric interface to the PCIe until such time
> as we know the FW is in a known good state.
>
> To accomplish this we use the mailbox as a mechanism for us to recognize
> that the FW has acknowledged our presence and is no longer sending any
> stale message data to us.
>
> We start in fbnic_mbx_init by calling fbnic_mbx_reset_desc_ring function,
> disabling the DMA in both directions, and then invalidating all the
> descriptors in each ring.
>
> We then poll the mailbox in fbnic_mbx_poll_tx_ready and when the interrupt
> is set by the FW we pick it up and mark the mailboxes as ready, while also
> enabling the DMA.
>
> Once we have completed all the transactions and need to shut down we call
> into fbnic_mbx_clean which will in turn call fbnic_mbx_reset_desc_ring for
> each ring and shut down the DMA and once again invalidate the descriptors.
>
> Fixes: 3646153161f1 ("eth: fbnic: Add register init to set PCIe/Ethernet device config")
> Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism")
nit: No blank linke here please.
> Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
The nit above aside, this looks good to me.
Reviewed-by: Simon Horman <horms@kernel.org>
...
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [net PATCH 4/6] fbnic: Actually flush_tx instead of stalling out
2025-05-01 23:30 ` [net PATCH 4/6] fbnic: Actually flush_tx instead of stalling out Alexander Duyck
@ 2025-05-02 10:54 ` Simon Horman
0 siblings, 0 replies; 16+ messages in thread
From: Simon Horman @ 2025-05-02 10:54 UTC (permalink / raw)
To: Alexander Duyck; +Cc: netdev, davem, kuba, pabeni
On Thu, May 01, 2025 at 04:30:16PM -0700, Alexander Duyck wrote:
> From: Alexander Duyck <alexanderduyck@fb.com>
>
> The fbnic_mbx_flush_tx function had a number of issues.
>
> First, we were waiting 200ms for the firmware to process the packets. We
> can drop this to 20ms and in almost all cases this should be more than
> enough time. So by changing this we can significantly reduce shutdown time.
>
> Second, we were not making sure that the Tx path was actually shut off. As
> such we could still have packets added while we were flushing the mailbox.
> To prevent that we can now clear the ready flag for the Tx side and it
> should stay down since the interrupt is disabled.
>
> Third, we kept re-reading the tail due to the second issue. The tail should
> not move after we have started the flush so we can just read it once while
> we are holding the mailbox Tx lock. By doing that we are guaranteed that
> the value should be consistent.
>
> Fourth, we were keeping a count of descriptors cleaned due to the second
> and third issues called out. That count is not a valid reason to be exiting
> the cleanup, and with the tail only being read once we shouldn't see any
> cases where the tail moves after the disable so the tracking of count can
> be dropped.
>
> Fifth, we were using attempts * sleep time to determine how long we would
> wait in our polling loop to flush out the Tx. This can be very imprecise.
> In order to tighten up the timing we are shifting over to using a jiffies
> value of jiffies + 10 * HZ + 1 to determine the jiffies value we should
> stop polling at as this should be accurate within once sleep cycle for the
> total amount of time spent polling.
>
> Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism")
nit: No blank line here
>
> Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
The nit above aside, this looks good to me.
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [net PATCH 3/6] fbnic: Add additional handling of IRQs
2025-05-01 23:30 ` [net PATCH 3/6] fbnic: Add additional handling of IRQs Alexander Duyck
@ 2025-05-02 13:51 ` Simon Horman
0 siblings, 0 replies; 16+ messages in thread
From: Simon Horman @ 2025-05-02 13:51 UTC (permalink / raw)
To: Alexander Duyck; +Cc: netdev, davem, kuba, pabeni
On Thu, May 01, 2025 at 04:30:10PM -0700, Alexander Duyck wrote:
> From: Alexander Duyck <alexanderduyck@fb.com>
>
> We have two issues that need to be addressed in our IRQ handling.
>
> One is the fact that we can end up double-freeing IRQs in the event of an
> exception handling error such as a PCIe reset/recovery that fails. To
> prevent that from becoming an issue we can use the msix_vector values to
> indicate that we have successfully requested/freed the IRQ by only setting
> or clearing them when we have completed the tiven action.
nit: given
>
> The other issue is that we have several potential races in our IRQ path due
> to us manipulating the mask before the vector has been truly disabled. In
> order to handle that in the case of the FW mailbox we need to not
> auto-enable the IRQ and instead will be enabling/disabling it separately.
> In the case of the PCS vector we can mitigate this by unmapping it and
> synchronizing the IRQ before we clear the mask.
>
> The general order of operations after this change is now to request the
> interrupt, poll the FW mailbox to ready, and then enable the interrupt. For
> the shutdown we do the reverse where we disable the interrupt, flush any
> pending Tx, and then free the IRQ. I am renaming the enable/disable to
> request/free to be equivilent with the IRQ calls being used. We may see
> additions in the future to enable/disable the IRQs versus request/free them
> for certain use cases.
Thanks for the nice explanation. And likewise throughout this series.
It's much appreciated.
>
> Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism")
> Fixes: 69684376eed5 ("eth: fbnic: Add link detection")
> Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [net PATCH 6/6] fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context
2025-05-01 23:30 ` [net PATCH 6/6] fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context Alexander Duyck
@ 2025-05-02 16:54 ` Simon Horman
2025-05-04 14:53 ` Alexander Duyck
0 siblings, 1 reply; 16+ messages in thread
From: Simon Horman @ 2025-05-02 16:54 UTC (permalink / raw)
To: Alexander Duyck; +Cc: netdev, davem, kuba, pabeni
On Thu, May 01, 2025 at 04:30:30PM -0700, Alexander Duyck wrote:
> From: Alexander Duyck <alexanderduyck@fb.com>
>
> This change pulls the call to fbnic_fw_xmit_cap_msg out of
> fbnic_mbx_init_desc_ring and instead places it in the polling function for
> getting the Tx ready. Doing that we can avoid the potential issue with an
> interrupt coming in later from the firmware that causes it to get fired in
> interrupt context.
>
> In addition we can add additional verification to the poll_tx_ready
> function to make sure that the mailbox is actually ready by verifying that
> it has populated the capabilities from the firmware. This is important as
> the link config relies on this and we were currently delaying this until
> the open call was made which would force the capbabilities message to be
> processed then. This resolves potential issues with the link state being
> inconsistent between the netdev being registered and the open call being
> made.
>
> Lastly we can make the overall mailbox poll-to-ready more
> reliable/responsive by reducing the overall sleep time and using a jiffies
> based timeout method instead of relying on X number of sleeps/"attempts".
This patch really feels like it ought to be three patches.
Perhaps that comment applies to other patches in this series,
but this one seems to somehow stand out in that regard.
>
> Fixes: 20d2e88cc746 ("eth: fbnic: Add initial messaging to notify FW of our presence")
>
> Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
...
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [net PATCH 5/6] fbnic: Cleanup handling of completions
2025-05-02 10:45 ` Simon Horman
@ 2025-05-04 14:37 ` Alexander Duyck
0 siblings, 0 replies; 16+ messages in thread
From: Alexander Duyck @ 2025-05-04 14:37 UTC (permalink / raw)
To: Simon Horman; +Cc: netdev, davem, kuba, pabeni
On Fri, May 2, 2025 at 3:45 AM Simon Horman <horms@kernel.org> wrote:
>
> On Thu, May 01, 2025 at 04:30:22PM -0700, Alexander Duyck wrote:
> > From: Alexander Duyck <alexanderduyck@fb.com>
> >
> > There was an issue in that if we were to shutdown we could be left with
> > a completion in flight as the mailbox went away. To address that I have
> > added an fbnic_mbx_evict_all_cmpl function that is meant to essentially
> > create a "broken pipe" type response so that all callers will receive an
> > error indicating that the connection has been broken as a result of us
> > shutting down the mailbox.
> >
> > In addition the naming was inconsistent between the creation and clearing
> > of completions. Since the cmpl seems to be the common suffix to use for the
> > handling of cmpl_data I went through and renamed fbnic_fw_clear_compl to
> > fbnic_fw_clear_cmpl.
>
> I do see this is somehow related to the fix described in the first
> paragraph. But I don't think renaming functions like this is appropriate
> for net.
I can drop this. As it stands we have a follow-up patch for net-next
which I had split these fixes from so we can most likely leave it for
there.
> > Fixes: 378e5cc1c6c6 ("eth: fbnic: hwmon: Add completion infrastructure for firmware requests")
> >
> > Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
> > ---
> > drivers/net/ethernet/meta/fbnic/fbnic_fw.c | 22 +++++++++++++++++++++-
> > drivers/net/ethernet/meta/fbnic/fbnic_fw.h | 2 +-
> > drivers/net/ethernet/meta/fbnic/fbnic_mac.c | 2 +-
> > 3 files changed, 23 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
> > index d019191d6ae9..efc0176f1a9a 100644
> > --- a/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
> > +++ b/drivers/net/ethernet/meta/fbnic/fbnic_fw.c
> > @@ -928,6 +928,23 @@ int fbnic_mbx_poll_tx_ready(struct fbnic_dev *fbd)
> > return attempts ? 0 : -ETIMEDOUT;
> > }
> >
> > +static void __fbnic_fw_evict_cmpl(struct fbnic_fw_completion *cmpl_data)
> > +{
> > + cmpl_data->result = -EPIPE;
> > + complete(&cmpl_data->done);
> > +}
> > +
> > +static void fbnic_mbx_evict_all_cmpl(struct fbnic_dev *fbd)
> > +{
> > + struct fbnic_fw_completion *cmpl_data;
> > +
> > + cmpl_data = fbd->cmpl_data;
> > + if (cmpl_data)
> > + __fbnic_fw_evict_cmpl(cmpl_data);
> > +
> > + memset(&fbd->cmpl_data, 0, sizeof(fbd->cmpl_data));
>
> Maybe I've been staring at my screen for too long, but could this
> be expressed more succinctly as:
>
> fbd->cmpl_data = NULL;
>
> And if so, it seems that step can be skipped if cmpl_data is already
> NULL, which is already checked.
>
> Leading to the following, which is somehow easier on my brain.
>
> static void fbnic_mbx_evict_all_cmpl(struct fbnic_dev *fbd)
> {
> if (fbd->cmpl_data) {
> __fbnic_fw_evict_cmpl(fbd->cmpl_data);
> fbd->cmpl_data = NULL;
> }
> }
I'll make the change. There is a follow-on patch for net-next that
will add support for multiple completions. As such the code may roll
back a bit to the original form, but it will make more sense if
contained within a for loop.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [net PATCH 6/6] fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context
2025-05-02 16:54 ` Simon Horman
@ 2025-05-04 14:53 ` Alexander Duyck
2025-05-06 15:50 ` Simon Horman
0 siblings, 1 reply; 16+ messages in thread
From: Alexander Duyck @ 2025-05-04 14:53 UTC (permalink / raw)
To: Simon Horman; +Cc: netdev, davem, kuba, pabeni
On Fri, May 2, 2025 at 9:54 AM Simon Horman <horms@kernel.org> wrote:
>
> On Thu, May 01, 2025 at 04:30:30PM -0700, Alexander Duyck wrote:
> > From: Alexander Duyck <alexanderduyck@fb.com>
> >
> > This change pulls the call to fbnic_fw_xmit_cap_msg out of
> > fbnic_mbx_init_desc_ring and instead places it in the polling function for
> > getting the Tx ready. Doing that we can avoid the potential issue with an
> > interrupt coming in later from the firmware that causes it to get fired in
> > interrupt context.
> >
> > In addition we can add additional verification to the poll_tx_ready
> > function to make sure that the mailbox is actually ready by verifying that
> > it has populated the capabilities from the firmware. This is important as
> > the link config relies on this and we were currently delaying this until
> > the open call was made which would force the capbabilities message to be
> > processed then. This resolves potential issues with the link state being
> > inconsistent between the netdev being registered and the open call being
> > made.
> >
> > Lastly we can make the overall mailbox poll-to-ready more
> > reliable/responsive by reducing the overall sleep time and using a jiffies
> > based timeout method instead of relying on X number of sleeps/"attempts".
>
> This patch really feels like it ought to be three patches.
> Perhaps that comment applies to other patches in this series,
> but this one seems to somehow stand out in that regard.
Yeah, part of the issue is that these patches all became an exercise
in "flipping rocks". Every time I touched one thing it exposed a bunch
more bugs. I'll try to split this one up a bit more. I should be able
to defer the need for the management version until net-next which will
cut down on the noise.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [net PATCH 6/6] fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context
2025-05-04 14:53 ` Alexander Duyck
@ 2025-05-06 15:50 ` Simon Horman
0 siblings, 0 replies; 16+ messages in thread
From: Simon Horman @ 2025-05-06 15:50 UTC (permalink / raw)
To: Alexander Duyck; +Cc: netdev, davem, kuba, pabeni
On Sun, May 04, 2025 at 07:53:09AM -0700, Alexander Duyck wrote:
> On Fri, May 2, 2025 at 9:54 AM Simon Horman <horms@kernel.org> wrote:
> >
> > On Thu, May 01, 2025 at 04:30:30PM -0700, Alexander Duyck wrote:
> > > From: Alexander Duyck <alexanderduyck@fb.com>
> > >
> > > This change pulls the call to fbnic_fw_xmit_cap_msg out of
> > > fbnic_mbx_init_desc_ring and instead places it in the polling function for
> > > getting the Tx ready. Doing that we can avoid the potential issue with an
> > > interrupt coming in later from the firmware that causes it to get fired in
> > > interrupt context.
> > >
> > > In addition we can add additional verification to the poll_tx_ready
> > > function to make sure that the mailbox is actually ready by verifying that
> > > it has populated the capabilities from the firmware. This is important as
> > > the link config relies on this and we were currently delaying this until
> > > the open call was made which would force the capbabilities message to be
> > > processed then. This resolves potential issues with the link state being
> > > inconsistent between the netdev being registered and the open call being
> > > made.
> > >
> > > Lastly we can make the overall mailbox poll-to-ready more
> > > reliable/responsive by reducing the overall sleep time and using a jiffies
> > > based timeout method instead of relying on X number of sleeps/"attempts".
> >
> > This patch really feels like it ought to be three patches.
> > Perhaps that comment applies to other patches in this series,
> > but this one seems to somehow stand out in that regard.
>
> Yeah, part of the issue is that these patches all became an exercise
> in "flipping rocks". Every time I touched one thing it exposed a bunch
> more bugs. I'll try to split this one up a bit more. I should be able
> to defer the need for the management version until net-next which will
> cut down on the noise.
Thanks. I could see that you were working your way through some sort of
rabbit hole situation. And while I certainly don't want to be unreasonable.
If would be nice if you could split this one up a bit. And it would
be a bonus in my view if some bits could be deferred to net-next.
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2025-05-06 15:50 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-01 23:29 [net PATCH 0/6] fbnic: FW IPC Mailbox fixes Alexander Duyck
2025-05-01 23:29 ` [net PATCH 1/6] fbnic: Fix initialization of mailbox descriptor rings Alexander Duyck
2025-05-02 10:49 ` Simon Horman
2025-05-01 23:30 ` [net PATCH 2/6] fbnic: Gate AXI read/write enabling on FW mailbox Alexander Duyck
2025-05-02 10:50 ` Simon Horman
2025-05-01 23:30 ` [net PATCH 3/6] fbnic: Add additional handling of IRQs Alexander Duyck
2025-05-02 13:51 ` Simon Horman
2025-05-01 23:30 ` [net PATCH 4/6] fbnic: Actually flush_tx instead of stalling out Alexander Duyck
2025-05-02 10:54 ` Simon Horman
2025-05-01 23:30 ` [net PATCH 5/6] fbnic: Cleanup handling of completions Alexander Duyck
2025-05-02 10:45 ` Simon Horman
2025-05-04 14:37 ` Alexander Duyck
2025-05-01 23:30 ` [net PATCH 6/6] fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context Alexander Duyck
2025-05-02 16:54 ` Simon Horman
2025-05-04 14:53 ` Alexander Duyck
2025-05-06 15:50 ` Simon Horman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).