* [PATCH 0/2] net/cxgbe: fix packed Rx handling and flow port scoping
  From: Abdulrahman Alshawi @ 2026-04-27 16:30 UTC
  To: dev; +Cc: bharat, stable

This series fixes two correctness issues in the cxgbe PMD that can
cause traffic loss on Chelsio T6 adapters, especially when rte_flow
QUEUE rules concentrate ingress on a small set of Rx queues.

Patch 1 fixes packed Rx response handling. The current PMD assumes
every response descriptor starts a new Free List buffer by requiring
F_RSPD_NEWBUF on each response. That assumption does not always hold
for packed ingress responses. Under sustained small-packet traffic to a
single ingress queue, the FL/IQ state goes out of sync and the affected
Rx path stops making forward progress.

Patch 2 scopes rte_flow rules to the ingress port they were created on.
Chelsio filters are programmed in adapter-wide tables, and the PMD
already supports the iport field in the hardware filter spec. However,
the flow parser never fills it for normal per-port rules, so a rule
created on one port can also match traffic arriving on sibling ports of
the same adapter.

Both issues reproduce with stock testpmd on a T62100-LP-CR. The
per-patch commit messages include the details and reproducers.

Abdulrahman Alshawi (2):
  net/cxgbe: fix Rx handling for packed responses
  net/cxgbe: restrict rte_flow rules to ingress port

 .mailmap                         |   1 +
 drivers/net/cxgbe/base/adapter.h |   1 +
 drivers/net/cxgbe/cxgbe_flow.c   |  19 +++
 drivers/net/cxgbe/sge.c          | 122 ++++++++++++++++++++++++-------
 4 files changed, 118 insertions(+), 25 deletions(-)

-- 
2.39.5
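A back-of-the-envelope view of the patch 1 failure mode (an
illustration only, with made-up counts; the real logic lives in
drivers/net/cxgbe/sge.c):

	/* Hypothetical burst: the hardware packs three small packets
	 * into a single Free List buffer, but the old Rx path frees
	 * one FL buffer per response, so its consumer index drifts
	 * ahead of the hardware on every packed burst. */
	#include <stdio.h>

	int main(void)
	{
		int responses = 3;           /* packed responses in the burst */
		int hw_buffers = 1;          /* FL buffers the hardware used */
		int pmd_buffers = responses; /* old PMD: one free per response */

		printf("FL consumer drift per burst: %d buffer(s)\n",
		       pmd_buffers - hw_buffers);
		return 0;
	}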
* [PATCH 1/2] net/cxgbe: fix Rx handling for packed responses
  From: Abdulrahman Alshawi @ 2026-04-27 16:40 UTC
  To: dev; +Cc: bharat, stable

The Rx path assumes every SGE response starts a new Free List buffer:

	BUG_ON(!(len & F_RSPD_NEWBUF));

That is not always true. On T5/T6, small packets can be packed into the
same FL buffer. Only the first response for that buffer has NEWBUF set;
later responses can refer to the same buffer at a different payload
offset.

The current PMD consumes one FL buffer per response. When packed
responses are delivered, the FL consumer state goes out of sync with
the hardware and the affected ingress queue stops making progress. From
user space this shows up as:

- rx_bgN_dropped_packets increasing at line rate
- q_ipackets for the affected queue staying at 0
- imissed increasing
- no recovery until the port is restarted

Fix this by tracking the packed-buffer alignment and copying packet
payload out of the current FL buffer at q->offset, freeing the FL
buffer only once it has actually been consumed. This patch:

- adds sge::fl_align from the ingress pad/pack settings
- introduces cxgbe_copy_rx_pkt() to copy from packed FL buffers
- advances q->offset by len rounded up to fl_align
- frees an FL buffer only after it is fully consumed
- drains the previous buffer when a later response arrives with NEWBUF

Reproduce (T62100-LP-CR, FW 2.1.19.0 / TP 0.1.23.2):

  dpdk-testpmd -l 1-9 -a 0000:18:00.4 -- \
      --rxq=32 --txq=32 --nb-cores=8 --forward-mode=rxonly -i

  testpmd> port stop all
  testpmd> flow flush 0
  testpmd> flow create 0 ingress pattern eth \
      / vlan tci spec 0 tci mask 0x0007 \
      / end actions queue index 5 / end
  testpmd> port start all
  testpmd> start

Without this patch, rx_qN_packets stays at 0 while
rx_bgN_dropped_packets rises at line rate. With the patch, rx_qN_packets
tracks received traffic and rx_bgN_dropped_packets stays at 0.
Signed-off-by: Abdulrahman Alshawi <ashawi@wirefilter.com>
---
 drivers/net/cxgbe/base/adapter.h |   1 +
 drivers/net/cxgbe/sge.c          | 122 ++++++++++++++++++++++++-------
 2 files changed, 98 insertions(+), 25 deletions(-)

diff --git a/drivers/net/cxgbe/base/adapter.h b/drivers/net/cxgbe/base/adapter.h
index 207f3ecb88..e67cf22950 100644
--- a/drivers/net/cxgbe/base/adapter.h
+++ b/drivers/net/cxgbe/base/adapter.h
@@ -280,6 +280,7 @@ struct sge {
 	u16 max_ethqsets;	/* # of available Ethernet queue sets */
 	u32 stat_len;		/* length of status page at ring end */
 	u32 pktshift;		/* padding between CPL & packet data */
+	u32 fl_align;		/* packed Rx packet alignment */
 
 	/* response queue interrupt parameters */
 	u16 timer_val[SGE_NTIMERS];
diff --git a/drivers/net/cxgbe/sge.c b/drivers/net/cxgbe/sge.c
index e9d45f24c4..7cf5c70775 100644
--- a/drivers/net/cxgbe/sge.c
+++ b/drivers/net/cxgbe/sge.c
@@ -1492,6 +1492,90 @@ static inline void cxgbe_fill_mbuf_info(struct adapter *adap,
 					RTE_MBUF_F_RX_L4_CKSUM_BAD);
 }
 
+static int cxgbe_copy_rx_pkt(struct sge_rspq *q, struct sge_eth_rxq *rxq,
+			     u32 len, struct rte_mbuf **out)
+{
+	struct sge *s = &q->adapter->sge;
+	struct rte_mbuf *pkt;
+	char *dst;
+	u32 copied = 0;
+	u32 remaining = len;
+
+	pkt = rte_pktmbuf_alloc(q->mb_pool);
+	if (unlikely(!pkt)) {
+		rxq->rspq.eth_dev->data->rx_mbuf_alloc_failed++;
+		rxq->stats.rx_drops++;
+		return -ENOMEM;
+	}
+
+	dst = rte_pktmbuf_append(pkt, len);
+	if (unlikely(!dst)) {
+		rte_pktmbuf_free(pkt);
+		rxq->stats.rx_drops++;
+		return -ENOMEM;
+	}
+
+	while (remaining) {
+		const struct rx_sw_desc *rsd = &rxq->fl.sdesc[rxq->fl.cidx];
+		struct rte_mbuf *src = rsd->buf;
+		u32 bufsz = get_buf_size(q->adapter, rsd);
+		u32 copy_len;
+
+		if (unlikely(!src || q->offset < 0 ||
+			     (u32)q->offset >= bufsz)) {
+			rte_pktmbuf_free(pkt);
+			rxq->stats.rx_drops++;
+			return -EINVAL;
+		}
+
+		copy_len = RTE_MIN(bufsz - (u32)q->offset, remaining);
+		rte_memcpy(dst + copied,
+			   rte_pktmbuf_mtod_offset(src, const void *, q->offset),
+			   copy_len);
+
+		copied += copy_len;
+		remaining -= copy_len;
+		q->offset += copy_len;
+
+		if (remaining) {
+			free_rx_bufs(&rxq->fl, 1);
+			q->offset = 0;
+			continue;
+		}
+
+		q->offset = RTE_ALIGN_CEIL(q->offset, s->fl_align);
+		if ((u32)q->offset >= bufsz) {
+			free_rx_bufs(&rxq->fl, 1);
+			q->offset = 0;
+		}
+	}
+
+	*out = pkt;
+	return 0;
+}
+
+static unsigned int cxgbe_fl_pkt_align(struct adapter *adap)
+{
+	u32 sge_control = t4_read_reg(adap, A_SGE_CONTROL);
+	unsigned int ingpad_shift, ingpad, fl_align;
+
+	ingpad_shift = CHELSIO_CHIP_VERSION(adap->params.chip) <= CHELSIO_T5 ?
+		       X_INGPADBOUNDARY_SHIFT : X_T6_INGPADBOUNDARY_SHIFT;
+	ingpad = 1U << (G_INGPADBOUNDARY(sge_control) + ingpad_shift);
+	fl_align = ingpad;
+
+	if (!is_t4(adap->params.chip)) {
+		u32 sge_control2 = t4_read_reg(adap, A_SGE_CONTROL2);
+		unsigned int ingpack = G_INGPACKBOUNDARY(sge_control2);
+
+		ingpack = ingpack == X_INGPACKBOUNDARY_16B ?
+			  16 : 1U << (ingpack + X_INGPACKBOUNDARY_SHIFT);
+		fl_align = RTE_MAX(ingpad, ingpack);
+	}
+
+	return fl_align ? fl_align : RTE_CACHE_LINE_SIZE;
+}
+
 /**
  * process_responses - process responses from an SGE response queue
  * @q: the ingress queue to process
@@ -1535,14 +1619,12 @@ static int process_responses(struct sge_rspq *q, int budget,
 		stat_pidx = ntohs(q->stat->pidx);
 		stat_pidx_diff = P_IDXDIFF(q, stat_pidx);
 		while (stat_pidx_diff && budget_left) {
-			const struct rx_sw_desc *rsd =
-				&rxq->fl.sdesc[rxq->fl.cidx];
 			const struct rss_header *rss_hdr =
 				(const void *)q->cur_desc;
 			const struct cpl_rx_pkt *cpl =
 				(const void *)&q->cur_desc[1];
-			struct rte_mbuf *pkt, *npkt;
-			u32 len, bufsz;
+			struct rte_mbuf *pkt;
+			u32 len;
 
 			rc = (const struct rsp_ctrl *)
 			     ((const char *)q->cur_desc +
@@ -1553,28 +1635,16 @@ static int process_responses(struct sge_rspq *q, int budget,
 				break;
 
 			len = ntohl(rc->pldbuflen_qid);
-			BUG_ON(!(len & F_RSPD_NEWBUF));
-			pkt = rsd->buf;
-			npkt = pkt;
-			len = G_RSPD_LEN(len);
-			pkt->pkt_len = len;
-
-			/* Chain mbufs into len if necessary */
-			while (len) {
-				struct rte_mbuf *new_pkt = rsd->buf;
-
-				bufsz = min(get_buf_size(q->adapter,
-							 rsd), len);
-				new_pkt->data_len = bufsz;
-				unmap_rx_buf(&rxq->fl);
-				len -= bufsz;
-				npkt->next = new_pkt;
-				npkt = new_pkt;
-				pkt->nb_segs++;
-				rsd = &rxq->fl.sdesc[rxq->fl.cidx];
+			if (len & F_RSPD_NEWBUF) {
+				if (q->offset > 0) {
+					free_rx_bufs(&rxq->fl, 1);
+					q->offset = 0;
+				}
 			}
-			npkt->next = NULL;
-			pkt->nb_segs--;
+			len = G_RSPD_LEN(len);
+			ret = cxgbe_copy_rx_pkt(q, rxq, len, &pkt);
+			if (unlikely(ret))
+				break;
 
 			cxgbe_fill_mbuf_info(q->adapter, cpl, pkt);
@@ -2379,6 +2449,7 @@ int t4_sge_init(struct adapter *adap)
 	sge_control = t4_read_reg(adap, A_SGE_CONTROL);
 	s->pktshift = G_PKTSHIFT(sge_control);
 	s->stat_len = (sge_control & F_EGRSTATUSPAGESIZE) ? 128 : 64;
+	s->fl_align = cxgbe_fl_pkt_align(adap);
 	ret = t4_sge_init_soft(adap);
 	if (ret < 0) {
 		dev_err(adap, "%s: t4_sge_init_soft failed, error %d\n",
@@ -2516,6 +2587,7 @@ int t4vf_sge_init(struct adapter *adap)
 	s->stat_len = ((sge_control & F_EGRSTATUSPAGESIZE) ? 128 : 64);
 	s->pktshift = G_PKTSHIFT(sge_control);
+	s->fl_align = RTE_CACHE_LINE_SIZE;
 
 	/*
 	 * A FL with <= fl_starve_thres buffers is starving and a periodic
-- 
2.39.5
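To make the new offset arithmetic concrete, a standalone sketch (not
part of the patch; the 64-byte boundary and 60-byte lengths are example
values only — the driver derives the real boundary in
cxgbe_fl_pkt_align()):

	#include <stdio.h>

	#define FL_ALIGN 64u	/* example pack boundary */

	int main(void)
	{
		unsigned int offset = 0, len = 60, i;

		for (i = 0; i < 3; i++) {
			printf("response %u: copy %u bytes at offset %u\n",
			       i, len, offset);
			offset += len;
			/* round up to the pack boundary, as RTE_ALIGN_CEIL()
			 * does in cxgbe_copy_rx_pkt() */
			offset = (offset + FL_ALIGN - 1) & ~(FL_ALIGN - 1);
		}
		return 0;
	}

The three packed packets land at offsets 0, 64 and 128 of the same FL
buffer; the buffer is freed only once the aligned offset reaches the
buffer size.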
* Re: [PATCH 1/2] net/cxgbe: fix Rx handling for packed responses
  From: Abdulrahman Alshawi @ 2026-04-28  7:05 UTC
  To: dev; +Cc: bharat, stable, vipinpv85

Adding Vipin Varghese

-----Original Message-----
From: Abdulrahman <ashawi@wirefilter.com>
To: dev <dev@dpdk.org>
Cc: bharat <bharat@chelsio.com>; stable <stable@dpdk.org>
Date: Monday, 27 April 2026 7:40 PM +03
Subject: [PATCH 1/2] net/cxgbe: fix Rx handling for packed responses

[snip]
* [PATCH 2/2] net/cxgbe: restrict rte_flow rules to ingress port
  From: Abdulrahman Alshawi @ 2026-04-27 16:42 UTC
  To: dev; +Cc: bharat, stable

Chelsio filters are programmed in adapter-wide LE/TCAM tables shared by
all ports. rte_flow rules, however, are created on a specific ethdev
and are expected to apply only to traffic arriving on that port.

The PMD already supports ingress-port matching in the hardware filter
spec: the iport field is validated, used for hash-region selection when
tp.port_shift is available, and emitted in the firmware filter work
request. But the rte_flow parser never sets fs.val.iport/fs.mask.iport
for normal per-port rules. As a result, a rule created on one port is
installed as an adapter-wide match and can steer traffic received on
sibling ports of the same adapter.

In practice this causes cross-port steering. For example, a rule like
"vlan 100 -> queue 3" created on port 0 can also match VLAN 100 traffic
arriving on port 1 and redirect it into port 0's queue 3.

Fix this by stamping the creating ethdev's physical ingress port into
the filter spec before filter placement is decided. Do this only when
the active filter mode includes the port field (tp.port_shift >= 0); if
port matching is not available in the current filter mode, keep the
existing adapter-wide behavior.

Reproduce (two ports of the same adapter bound to DPDK):

  dpdk-testpmd -l 1-9 -a 0000:18:00.4 -a 0000:18:00.5 \
      -- --rxq=4 --txq=4 --forward-mode=rxonly -i

  testpmd> flow create 0 ingress pattern eth \
      / vlan tci is 100 / end actions queue index 3 / end
  testpmd> start

Without this patch, VLAN 100 traffic received on port 1 can be steered
by the rule created on port 0. With the patch, the rule only matches
traffic arriving on port 0.

Signed-off-by: Abdulrahman Alshawi <ashawi@wirefilter.com>
---
 .mailmap                       |  1 +
 drivers/net/cxgbe/cxgbe_flow.c | 19 +++++++++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/.mailmap b/.mailmap
index 0e0d83e1c6..a6bcbd5756 100644
--- a/.mailmap
+++ b/.mailmap
@@ -4,6 +4,7 @@ Aaro Koskinen <aaro.koskinen@nsn.com>
 Aaron Campbell <aaron@arbor.net>
 Aaron Conole <aconole@redhat.com>
 Abdullah Ömer Yamaç <omer.yamac@ceng.metu.edu.tr> <aomeryamac@gmail.com>
+Abdulrahman Alshawi <ashawi@wirefilter.com>
 Abdullah Sevincer <abdullah.sevincer@intel.com>
 Abed Kamaluddin <akamaluddin@marvell.com>
 Abhijit Gangurde <abhijit.gangurde@amd.com>
diff --git a/drivers/net/cxgbe/cxgbe_flow.c b/drivers/net/cxgbe/cxgbe_flow.c
index 14b9b49792..dd0634131e 100644
--- a/drivers/net/cxgbe/cxgbe_flow.c
+++ b/drivers/net/cxgbe/cxgbe_flow.c
@@ -172,6 +172,24 @@ cxgbe_fill_filter_region(struct adapter *adap,
 	fs->cap = 1;	/* use hash region */
 }
 
+static void
+cxgbe_scope_flow_to_port(struct rte_flow *flow)
+{
+	struct adapter *adap = ethdev2adap(flow->dev);
+	struct port_info *pi = ethdev2pinfo(flow->dev);
+
+	/*
+	 * Chelsio filters are programmed in adapter-global tables. DPDK
+	 * ingress rte_flow rules are created on a specific ethdev, so include
+	 * the physical ingress port when the active filter mode supports it.
+	 */
+	if (adap->params.tp.port_shift < 0)
+		return;
+
+	flow->fs.val.iport = pi->port_id;
+	flow->fs.mask.iport = (1U << IPORT_BITWIDTH) - 1;
+}
+
 static int
 ch_rte_parsetype_eth(const void *dmask, const struct rte_flow_item *item,
 		     struct ch_filter_specification *fs,
@@ -986,6 +1004,7 @@ cxgbe_rtef_parse_items(struct rte_flow *flow,
 	}
 
 	cxgbe_tweak_filter_spec(adap, &flow->fs);
+	cxgbe_scope_flow_to_port(flow);
 	cxgbe_fill_filter_region(adap, &flow->fs);
 
 	return 0;
-- 
2.39.5
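To illustrate why the full iport mask matters, a standalone sketch of
TCAM-style matching (illustration only; the 3-bit field width is an
assumption for this example — the driver uses its own IPORT_BITWIDTH
definition):

	#include <stdio.h>

	#define EXAMPLE_IPORT_BITS 3	/* assumed field width */

	/* A zero mask matches any ingress port; a full mask pins the
	 * filter to the stamped port. */
	static int iport_matches(unsigned int pkt_port,
				 unsigned int val, unsigned int mask)
	{
		return (pkt_port & mask) == (val & mask);
	}

	int main(void)
	{
		unsigned int full = (1u << EXAMPLE_IPORT_BITS) - 1;

		/* Rule created on port 0; packet arrives on port 1. */
		printf("mask 0 (old behavior): match=%d\n",
		       iport_matches(1, 0, 0));	/* 1: cross-port steering */
		printf("full mask (patched):  match=%d\n",
		       iport_matches(1, 0, full));	/* 0: scoped to port 0 */
		return 0;
	}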
* Re: [PATCH 2/2] net/cxgbe: restrict rte_flow rules to ingress port
  From: Abdulrahman Alshawi @ 2026-04-28  7:03 UTC
  To: dev; +Cc: bharat, stable, vipinpv85

Adding Vipin Varghese

-----Original Message-----
From: Abdulrahman <ashawi@wirefilter.com>
To: dev <dev@dpdk.org>
Cc: bharat <bharat@chelsio.com>; stable <stable@dpdk.org>
Date: Monday, 27 April 2026 7:42 PM +03
Subject: [PATCH 2/2] net/cxgbe: restrict rte_flow rules to ingress port

[snip]