* [Patch net-next v6 0/7] r8169: add RSS support for RTL8127
From: javen @ 2026-05-26 8:11 UTC
To: hkallweit1, nic_swsd, andrew+netdev, davem, edumazet, kuba,
pabeni, horms
Cc: netdev, linux-kernel, Javen Xu
From: Javen Xu <javen_xu@realsil.com.cn>
This patch series adds RSS (Receive Side Scaling) support to the r8169
Ethernet driver, specifically for RTL8127 (RTL_GIGA_MAC_VER_80).
RSS enables packet distribution across multiple receive queues, which can
significantly improve network throughput on multi-core systems by allowing
parallel processing of incoming packets.
Key features:
- Multi-queue RX support (up to 8 queues)
- MSI-X interrupt with vector mapping
- Dynamic queue configuration via ethtool (-L)
- RSS hash computation for flow classification
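As a rough illustration of the flow classification step, the sketch below
maps a flow tuple to an rx queue through a hash and an indirection table.
It is a user-space model only. The FNV-1a hash is a stand-in and is not
the hash the RTL8127 computes, and RSS_INDIR_SIZE, flow_hash(),
indir_init() and flow_to_queue() are hypothetical names that do not appear
in the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RSS_INDIR_SIZE 128	/* hypothetical indirection table size */

/* Stand-in hash (FNV-1a) over the flow tuple; NOT the hardware's hash. */
static uint32_t flow_hash(const uint8_t *tuple, size_t len)
{
	uint32_t h = 2166136261u;

	for (size_t i = 0; i < len; i++) {
		h ^= tuple[i];
		h *= 16777619u;
	}
	return h;
}

/* Spread the table entries round-robin over the active rx queues. */
static void indir_init(uint8_t *table, unsigned int num_queues)
{
	assert(num_queues > 0);
	for (unsigned int i = 0; i < RSS_INDIR_SIZE; i++)
		table[i] = i % num_queues;
}

/* The hash selects a table slot; the slot names the rx queue. */
static unsigned int flow_to_queue(const uint8_t *table,
				  const uint8_t *tuple, size_t len)
{
	return table[flow_hash(tuple, len) % RSS_INDIR_SIZE];
}
```

Because the queue depends only on the tuple, all packets of one flow land
on the same queue and stay in order, while different flows can be handled
on different CPUs.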
Experiments:
Platform: AMD Ryzen Embedded R2514 with Radeon Graphics (4 Cores/8 Threads)
Arch: x86_64
Test command:
Server: iperf3 -s
Client: iperf3 -c 192.168.2.1 -P 20 -t 3600
Monitor: mpstat -P ALL 1
Before this patch (Without RSS):
Throughput: Unstable, fluctuating between 3.76 Gbits/sec and
8.2 Gbits/sec.
CPU Usage: A single CPU core is fully occupied with softirq reaching
up to 96%.
After this patch (With RSS enabled):
Throughput: Stable at 9.42 Gbits/sec.
CPU Usage: The traffic load is evenly distributed across multiple CPU
cores. The maximum softirq on a single core dropped to 63%.
Other Experiments:
Link: https://lore.kernel.org/netdev/0A5279953D81BB9C+f50c9b49-3e5d-467f-b69a-7e49ed223383@radxa.com/
Javen Xu (7):
r8169: add support for multi irqs
r8169: add support for multi rx queues
r8169: add support for new interrupt mapping
r8169: enable new interrupt mapping
r8169: add support and enable rss
r8169: move struct ethtool_ops
r8169: support setting rx queue numbers via ethtool
drivers/net/ethernet/realtek/r8169_main.c | 1118 ++++++++++++++++++---
1 file changed, 977 insertions(+), 141 deletions(-)
--
2.43.0
* [Patch net-next v6 1/7] r8169: add support for multi irqs
From: javen @ 2026-05-26 8:11 UTC
To: hkallweit1, nic_swsd, andrew+netdev, davem, edumazet, kuba,
pabeni, horms
Cc: netdev, linux-kernel, Javen Xu
From: Javen Xu <javen_xu@realsil.com.cn>
RSS distributes received packets across multiple rx queues, and each rx
queue needs its own irq and napi instance. This patch therefore adds
support for multiple irqs and napi instances.
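The ownership model this implies can be sketched in user space: every
vector gets its own poll context, the handler receives that context as its
cookie, and it walks back to the shared device state, much as the patch
does with netdev_priv(napi->dev). All names below (fake_dev, fake_napi,
fake_interrupt() and so on) are hypothetical stand-ins, not kernel API.

```c
#include <assert.h>
#include <stdlib.h>

struct fake_dev;

struct fake_napi {
	struct fake_dev *dev;	/* back-pointer, like napi->dev */
	unsigned int scheduled;	/* how often this context was scheduled */
};

struct fake_dev {
	unsigned int nvecs;
	struct fake_napi *napi;	/* one context per vector */
};

/* Allocate one poll context per vector and link each back to the device. */
static int fake_dev_init(struct fake_dev *dev, unsigned int nvecs)
{
	assert(nvecs > 0);
	dev->napi = calloc(nvecs, sizeof(*dev->napi));
	if (!dev->napi)
		return -1;
	dev->nvecs = nvecs;
	for (unsigned int i = 0; i < nvecs; i++)
		dev->napi[i].dev = dev;
	return 0;
}

/* The cookie is the per-vector context, not the device itself. */
static struct fake_dev *fake_interrupt(void *cookie)
{
	struct fake_napi *napi = cookie;

	napi->scheduled++;
	return napi->dev;
}

static void fake_dev_free(struct fake_dev *dev)
{
	free(dev->napi);
	dev->napi = NULL;
	dev->nvecs = 0;
}
```

Passing the per-vector context as the cookie lets one handler serve every
vector without a lookup table from irq number to context.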
Signed-off-by: Javen Xu <javen_xu@realsil.com.cn>
---
Changes in v2:
- remove some unused definitions, such as index, name in rtl8169_irq
- remove array imr and isr
- remove min_irq_nvecs and max_irq_nvecs, replaced with the helper functions
get_min_irq_nvecs() and get_max_irq_nvecs()
- alloc irq by flags, instead of PCI_IRQ_ALL_TYPES
Changes in v3:
- add enum rtl_isr_version to replace macro definition
- remove struct rtl8169_napi, use napi_struct array instead and alloc
memory for this array dynamically
- remove struct rtl8169_irq
Changes in v4:
- change retval to ret in rtl8169_set_real_num_queue()
- reverse xmas tree in rtl8169_poll() and rtl8169_interrupt()
- remove tp->hw_supp_isr_ver
Changes in v5:
- rtl8169_request_irq(), when failed, only free irqs which are
allocated
- remove rss_support, simplify napi init, call r8169_init_napi()
directly
- remove rtl_isr_version, INTR_VEC_MAP_MASK, INTR_VEC_MAP_STATUS,
R8169_MAX_MSIX_VEC, rss_enable, recheck_desc_ownbit
- keep rtl_software_parameter_initialize() here, because the next patch
expands it
Changes in v6:
- Fix netpoll crash
- Fix use-after-free during driver unload by registering a devm action
for netif_napi_del()
- remove tp->irq
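The v5 note about freeing only the irqs that were actually allocated
refers to a partial-unwind loop. A minimal user-space sketch of that
pattern follows. request_vec(), free_vec() and the fail_at parameter are
hypothetical and exist only to simulate a failure part way through; they
are not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_VECS 8

static bool vec_requested[MAX_VECS];

/* Hypothetical request: fails at index fail_at, succeeds otherwise. */
static int request_vec(int i, int fail_at)
{
	if (i == fail_at)
		return -1;
	vec_requested[i] = true;
	return 0;
}

static void free_vec(int i)
{
	assert(vec_requested[i]);	/* never free what was not requested */
	vec_requested[i] = false;
}

/* Request nvecs vectors; on failure release only those already held. */
static int request_all(int nvecs, int fail_at)
{
	int i, rc;

	assert(nvecs <= MAX_VECS);
	for (i = 0; i < nvecs; i++) {
		rc = request_vec(i, fail_at);
		if (rc)
			goto unwind;
	}
	return 0;

unwind:
	while (--i >= 0)
		free_vec(i);
	return rc;
}

static int count_requested(void)
{
	int n = 0;

	for (int i = 0; i < MAX_VECS; i++)
		n += vec_requested[i];
	return n;
}
```

The pre-decrement in the unwind loop skips the index that failed, so the
cleanup never touches a vector that was not handed out.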
---
drivers/net/ethernet/realtek/r8169_main.c | 144 ++++++++++++++++++----
1 file changed, 120 insertions(+), 24 deletions(-)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index ec4fc21fa21f..22e843baffc7 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -733,7 +733,6 @@ struct rtl8169_private {
struct pci_dev *pci_dev;
struct net_device *dev;
struct phy_device *phydev;
- struct napi_struct napi;
enum mac_version mac_version;
enum rtl_dash_type dash_type;
u32 cur_rx; /* Index into the Rx descriptor buffer of next Rx pkt. */
@@ -745,10 +744,12 @@ struct rtl8169_private {
dma_addr_t RxPhyAddr;
struct page *Rx_databuff[NUM_RX_DESC]; /* Rx data buffers */
struct ring_info tx_skb[NUM_TX_DESC]; /* Tx data buffers */
+ struct napi_struct *rtl8169_napi;
+ unsigned int num_rx_rings;
u16 cp_cmd;
u16 tx_lpi_timer;
u32 irq_mask;
- int irq;
+ unsigned int irq_nvecs;
struct clk *clk;
struct {
@@ -2680,6 +2681,11 @@ static void rtl_hw_reset(struct rtl8169_private *tp)
rtl_loop_wait_low(tp, &rtl_chipcmd_cond, 100, 100);
}
+static void rtl_software_parameter_initialize(struct rtl8169_private *tp)
+{
+ tp->num_rx_rings = 1;
+}
+
static void rtl_request_firmware(struct rtl8169_private *tp)
{
struct rtl_fw *rtl_fw;
@@ -4266,9 +4272,21 @@ static void rtl8169_tx_clear(struct rtl8169_private *tp)
netdev_reset_queue(tp->dev);
}
+static void rtl8169_napi_disable(struct rtl8169_private *tp)
+{
+ for (int i = 0; i < tp->irq_nvecs; i++)
+ napi_disable(&tp->rtl8169_napi[i]);
+}
+
+static void rtl8169_napi_enable(struct rtl8169_private *tp)
+{
+ for (int i = 0; i < tp->irq_nvecs; i++)
+ napi_enable(&tp->rtl8169_napi[i]);
+}
+
static void rtl8169_cleanup(struct rtl8169_private *tp)
{
- napi_disable(&tp->napi);
+ rtl8169_napi_disable(tp);
/* Give a racing hard_start_xmit a few cycles to complete. */
synchronize_net();
@@ -4314,7 +4332,7 @@ static void rtl_reset_work(struct rtl8169_private *tp)
for (i = 0; i < NUM_RX_DESC; i++)
rtl8169_mark_to_asic(tp->RxDescArray + i);
- napi_enable(&tp->napi);
+ rtl8169_napi_enable(tp);
rtl_hw_start(tp);
}
@@ -4820,7 +4838,7 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
goto release_descriptor;
}
- skb = napi_alloc_skb(&tp->napi, pkt_size);
+ skb = napi_alloc_skb(&tp->rtl8169_napi[0], pkt_size);
if (unlikely(!skb)) {
dev->stats.rx_dropped++;
goto release_descriptor;
@@ -4844,7 +4862,7 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
if (skb->pkt_type == PACKET_MULTICAST)
dev->stats.multicast++;
- napi_gro_receive(&tp->napi, skb);
+ napi_gro_receive(&tp->rtl8169_napi[0], skb);
dev_sw_netstats_rx_add(dev, pkt_size);
release_descriptor:
@@ -4856,8 +4874,12 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
{
- struct rtl8169_private *tp = dev_instance;
- u32 status = rtl_get_events(tp);
+ struct napi_struct *napi = dev_instance;
+ struct rtl8169_private *tp;
+ u32 status;
+
+ tp = netdev_priv(napi->dev);
+ status = rtl_get_events(tp);
if ((status & 0xffff) == 0xffff || !(status & tp->irq_mask))
return IRQ_NONE;
@@ -4873,13 +4895,43 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
phy_mac_interrupt(tp->phydev);
rtl_irq_disable(tp);
- napi_schedule(&tp->napi);
+ napi_schedule(napi);
out:
rtl_ack_events(tp, status);
return IRQ_HANDLED;
}
+static void rtl8169_free_irq(struct rtl8169_private *tp)
+{
+ for (int i = 0; i < tp->irq_nvecs; i++) {
+ struct napi_struct *napi = &tp->rtl8169_napi[i];
+
+ pci_free_irq(tp->pci_dev, i, napi);
+ }
+}
+
+static int rtl8169_request_irq(struct rtl8169_private *tp)
+{
+ struct net_device *dev = tp->dev;
+ struct napi_struct *napi;
+ int i, rc;
+
+ for (i = 0; i < tp->irq_nvecs; i++) {
+ napi = &tp->rtl8169_napi[i];
+ rc = pci_request_irq(tp->pci_dev, i, rtl8169_interrupt,
+ NULL, napi, "%s-%d", dev->name, i);
+ if (rc)
+ goto free_irq;
+ }
+ return 0;
+
+free_irq:
+ while (--i >= 0)
+ pci_free_irq(tp->pci_dev, i, &tp->rtl8169_napi[i]);
+ return rc;
+}
+
static void rtl_task(struct work_struct *work)
{
struct rtl8169_private *tp =
@@ -4914,9 +4966,9 @@ static void rtl_task(struct work_struct *work)
static int rtl8169_poll(struct napi_struct *napi, int budget)
{
- struct rtl8169_private *tp = container_of(napi, struct rtl8169_private, napi);
- struct net_device *dev = tp->dev;
- int work_done;
+ struct rtl8169_private *tp = netdev_priv(napi->dev);
+ struct net_device *dev = napi->dev;
+ int work_done = 0;
rtl_tx(dev, tp, budget);
@@ -5035,7 +5087,7 @@ static void rtl8169_up(struct rtl8169_private *tp)
phy_init_hw(tp->phydev);
phy_resume(tp->phydev);
rtl8169_init_phy(tp);
- napi_enable(&tp->napi);
+ rtl8169_napi_enable(tp);
enable_work(&tp->wk.work);
rtl_reset_work(tp);
@@ -5053,7 +5105,7 @@ static int rtl8169_close(struct net_device *dev)
rtl8169_down(tp);
rtl8169_rx_clear(tp);
- free_irq(tp->irq, tp);
+ rtl8169_free_irq(tp);
phy_disconnect(tp->phydev);
@@ -5074,7 +5126,7 @@ static void rtl8169_netpoll(struct net_device *dev)
{
struct rtl8169_private *tp = netdev_priv(dev);
- rtl8169_interrupt(tp->irq, tp);
+ rtl8169_interrupt(pci_irq_vector(tp->pci_dev, 0), &tp->rtl8169_napi[0]);
}
#endif
@@ -5082,7 +5134,6 @@ static int rtl_open(struct net_device *dev)
{
struct rtl8169_private *tp = netdev_priv(dev);
struct pci_dev *pdev = tp->pci_dev;
- unsigned long irqflags;
int retval = -ENOMEM;
pm_runtime_get_sync(&pdev->dev);
@@ -5107,8 +5158,7 @@ static int rtl_open(struct net_device *dev)
rtl_request_firmware(tp);
- irqflags = pci_dev_msi_enabled(pdev) ? IRQF_NO_THREAD : IRQF_SHARED;
- retval = request_irq(tp->irq, rtl8169_interrupt, irqflags, dev->name, tp);
+ retval = rtl8169_request_irq(tp);
if (retval < 0)
goto err_release_fw_2;
@@ -5125,7 +5175,7 @@ static int rtl_open(struct net_device *dev)
return retval;
err_free_irq:
- free_irq(tp->irq, tp);
+ rtl8169_free_irq(tp);
err_release_fw_2:
rtl_release_firmware(tp);
rtl8169_rx_clear(tp);
@@ -5328,7 +5378,9 @@ static void rtl_set_irq_mask(struct rtl8169_private *tp)
static int rtl_alloc_irq(struct rtl8169_private *tp)
{
+ struct pci_dev *pdev = tp->pci_dev;
unsigned int flags;
+ int nvecs;
switch (tp->mac_version) {
case RTL_GIGA_MAC_VER_02 ... RTL_GIGA_MAC_VER_06:
@@ -5344,7 +5396,14 @@ static int rtl_alloc_irq(struct rtl8169_private *tp)
break;
}
- return pci_alloc_irq_vectors(tp->pci_dev, 1, 1, flags);
+ nvecs = pci_alloc_irq_vectors(pdev, 1, 1, flags);
+
+ if (nvecs < 0)
+ return nvecs;
+
+ tp->irq_nvecs = nvecs;
+
+ return 0;
}
static void rtl_read_mac_address(struct rtl8169_private *tp,
@@ -5539,6 +5598,17 @@ static void rtl_hw_initialize(struct rtl8169_private *tp)
}
}
+static int rtl8169_set_real_num_queues(struct rtl8169_private *tp)
+{
+ int ret;
+
+ ret = netif_set_real_num_tx_queues(tp->dev, 1);
+ if (ret < 0)
+ return ret;
+
+ return netif_set_real_num_rx_queues(tp->dev, tp->num_rx_rings);
+}
+
static int rtl_jumbo_max(struct rtl8169_private *tp)
{
/* Non-GBit versions don't support jumbo frames */
@@ -5599,6 +5669,22 @@ static bool rtl_aspm_is_safe(struct rtl8169_private *tp)
return false;
}
+static void r8169_del_napi_action(void *data)
+{
+ struct rtl8169_private *tp = data;
+ int i;
+
+ for (i = 0; i < tp->irq_nvecs; i++)
+ netif_napi_del(&tp->rtl8169_napi[i]);
+}
+
+static void r8169_init_napi(struct rtl8169_private *tp)
+{
+ for (int i = 0; i < tp->irq_nvecs; i++)
+ netif_napi_add(tp->dev, &tp->rtl8169_napi[i], rtl8169_poll);
+ devm_add_action_or_reset(&tp->pci_dev->dev, r8169_del_napi_action, tp);
+}
+
static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
{
const struct rtl_chip_info *chip;
@@ -5703,11 +5789,16 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
rtl_hw_reset(tp);
+ rtl_software_parameter_initialize(tp);
+
rc = rtl_alloc_irq(tp);
if (rc < 0)
return dev_err_probe(&pdev->dev, rc, "Can't allocate interrupt\n");
- tp->irq = pci_irq_vector(pdev, 0);
+ tp->rtl8169_napi = devm_kcalloc(&pdev->dev, tp->irq_nvecs,
+ sizeof(struct napi_struct), GFP_KERNEL);
+ if (!tp->rtl8169_napi)
+ return -ENOMEM;
INIT_WORK(&tp->wk.work, rtl_task);
disable_work(&tp->wk.work);
@@ -5716,7 +5807,7 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
dev->ethtool_ops = &rtl8169_ethtool_ops;
- netif_napi_add(dev, &tp->napi, rtl8169_poll);
+ r8169_init_napi(tp);
dev->hw_features = NETIF_F_IP_CSUM | NETIF_F_RXCSUM |
NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX;
@@ -5778,6 +5869,10 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
if (jumbo_max)
dev->max_mtu = jumbo_max;
+ rc = rtl8169_set_real_num_queues(tp);
+ if (rc < 0)
+ return dev_err_probe(&pdev->dev, rc, "set tx/rx num failure\n");
+
rtl_set_irq_mask(tp);
tp->counters = dmam_alloc_coherent (&pdev->dev, sizeof(*tp->counters),
@@ -5803,8 +5898,9 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
tp->leds = rtl8168_init_leds(dev);
}
- netdev_info(dev, "%s, %pM, %sXID %x, IRQ %d\n",
- chip->name, dev->dev_addr, ext_xid_str, xid, tp->irq);
+ netdev_info(dev, "%s, %pM, %sXID %x, IRQ %d (%d total)\n",
+ chip->name, dev->dev_addr, ext_xid_str, xid,
+ pci_irq_vector(pdev, 0), tp->irq_nvecs);
if (jumbo_max)
netdev_info(dev, "jumbo features [frames: %d bytes, tx checksumming: %s]\n",
--
2.43.0
* [Patch net-next v6 2/7] r8169: add support for multi rx queues
From: javen @ 2026-05-26 8:11 UTC
To: hkallweit1, nic_swsd, andrew+netdev, davem, edumazet, kuba,
pabeni, horms
Cc: netdev, linux-kernel, Javen Xu
From: Javen Xu <javen_xu@realsil.com.cn>
RSS requires multiple rx queues to receive packets, so this patch adds
support for them. Each queue gets its own struct rtl8169_rx_ring.
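What keeping the state per ring means for the receive loop can be shown
with a small user-space model: each ring has its own free-running cur_rx
index, and the loop stops at the first descriptor the NIC still owns or
when the budget is used up. The ring size, struct model_ring and
model_rx() are hypothetical simplifications. The real loop also handles
errors, DMA syncing and the ownership recheck workaround.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 4
#define DESC_OWN  0x80000000u	/* set: descriptor belongs to the NIC */

struct model_ring {
	uint32_t cur_rx;		/* free-running index of next descriptor */
	uint32_t opts1[RING_SIZE];	/* status word per descriptor */
};

/* Consume up to budget completed descriptors from one ring. */
static int model_rx(struct model_ring *ring, int budget)
{
	int count;

	assert(budget >= 0);
	for (count = 0; count < budget; count++, ring->cur_rx++) {
		unsigned int entry = ring->cur_rx % RING_SIZE;

		if (ring->opts1[entry] & DESC_OWN)
			break;		/* NIC has not filled this one yet */

		/* The packet would be handed to the stack here. */
		ring->opts1[entry] = DESC_OWN;	/* give it back to the NIC */
	}
	return count;
}
```

Two rings processed this way never share an index, which is why cur_rx
moves out of the private struct and into the ring.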
Signed-off-by: Javen Xu <javen_xu@realsil.com.cn>
---
Changes in v2:
- sort some registers by their numbers
- remove some unused definitions, like RX_DESC_RING_TYPE_MAX
- change recheck_desc_ownbit type
- remove rdsar_reg in rx_ring struct
- opts1 differs between rx_desc and rx_desc_rss, move the check
to Patch 5/7
Changes in v3:
- remove ring->rx_desc_alloc_size, use constant instead
Changes in v4:
- change rdsar_reg type to unsigned int
- follow reverse xmas tree, in rtl_set_rx_tx_desc_registers(),
rtl8169_alloc_rx_data(), rtl8169_alloc_rx_desc(),
rtl8169_free_rx_desc()
- add comments on LED_CTRL, remove helper function
Changes in v5:
- modify rtl8169_init_ring(), do rx clear when failed
- add definition R8169_MAX_TX_QUEUES 1
Changes in v6:
- Restore the secondary Rx error filter when NETIF_F_RXALL is enabled
in rtl_rx()
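The register arithmetic behind rdsar_reg can be checked in isolation. The
sketch below reproduces the offset computation the patch uses for rx
queues 1 and up (RDSAR_Q1_LOW + (i - 1) * 8, with the high half at +4) and
the split of a 64-bit DMA address into two 32-bit writes. The helper names
are hypothetical. Queue 0 keeps using RxDescAddrLow and RxDescAddrHigh and
is not covered here.

```c
#include <assert.h>
#include <stdint.h>

#define RDSAR_Q1_LOW 0x4000u	/* from the patch: queue 1, low 32 bits */

/* Offset of the low-address register for rx queue q (q >= 1). */
static unsigned int rdsar_low(unsigned int q)
{
	assert(q >= 1);
	return RDSAR_Q1_LOW + (q - 1) * 8;
}

/* The high 32 bits live in the next 4-byte register. */
static unsigned int rdsar_high(unsigned int q)
{
	return rdsar_low(q) + 4;
}

static uint32_t addr_low(uint64_t dma)
{
	return (uint32_t)(dma & 0xffffffffu);
}

static uint32_t addr_high(uint64_t dma)
{
	return (uint32_t)(dma >> 32);
}
```

With eight queues the last pair sits at 0x4030 and 0x4034, so queues 1
through 7 occupy 0x4000 to 0x4037.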
---
drivers/net/ethernet/realtek/r8169_main.c | 272 +++++++++++++++++-----
1 file changed, 211 insertions(+), 61 deletions(-)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 22e843baffc7..62bf77aa1ec8 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -74,9 +74,13 @@
#define NUM_TX_DESC 256 /* Number of Tx descriptor registers */
#define NUM_RX_DESC 256 /* Number of Rx descriptor registers */
#define R8169_TX_RING_BYTES (NUM_TX_DESC * sizeof(struct TxDesc))
-#define R8169_RX_RING_BYTES (NUM_RX_DESC * sizeof(struct RxDesc))
+#define R8169_RX_RING_BYTES ((NUM_RX_DESC + 1) * sizeof(struct RxDesc))
#define R8169_TX_STOP_THRS (MAX_SKB_FRAGS + 1)
#define R8169_TX_START_THRS (2 * R8169_TX_STOP_THRS)
+#define R8169_MAX_RX_QUEUES 8
+#define R8127_MAX_RX_QUEUES 8
+#define R8169_DEFAULT_RX_QUEUES 1
+#define R8169_MAX_TX_QUEUES 1
#define OCP_STD_PHY_BASE 0xa400
@@ -441,6 +445,7 @@ enum rtl8125_registers {
TxPoll_8125 = 0x90,
LEDSEL3 = 0x96,
MAC0_BKP = 0x19e0,
+ RDSAR_Q1_LOW = 0x4000,
RSS_CTRL_8125 = 0x4500,
Q_NUM_CTRL_8125 = 0x4800,
EEE_TXIDLE_TIMER_8125 = 0x6048,
@@ -728,6 +733,21 @@ enum rtl_dash_type {
RTL_DASH_25_BP,
};
+enum rx_desc_ring_type {
+ RX_DESC_RING_TYPE_DEFAULT,
+ RX_DESC_RING_TYPE_RSS,
+};
+
+struct rtl8169_rx_ring {
+ u32 index; /* Rx queue index */
+ u32 cur_rx; /* Index of next Rx pkt. */
+ u32 dirty_rx; /* Index for recycling. */
+ struct RxDesc *rx_desc_array; /* array of Rx Desc*/
+ dma_addr_t rx_desc_phy_addr[NUM_RX_DESC]; /* Rx data buffer physical dma address */
+ dma_addr_t rx_phy_addr; /* Rx desc physical address */
+ struct page *rx_databuff[NUM_RX_DESC]; /* Rx data buffers */
+};
+
struct rtl8169_private {
void __iomem *mmio_addr; /* memory map physical address */
struct pci_dev *pci_dev;
@@ -735,20 +755,18 @@ struct rtl8169_private {
struct phy_device *phydev;
enum mac_version mac_version;
enum rtl_dash_type dash_type;
- u32 cur_rx; /* Index into the Rx descriptor buffer of next Rx pkt. */
u32 cur_tx; /* Index into the Tx descriptor buffer of next Rx pkt. */
u32 dirty_tx;
struct TxDesc *TxDescArray; /* 256-aligned Tx descriptor ring */
- struct RxDesc *RxDescArray; /* 256-aligned Rx descriptor ring */
dma_addr_t TxPhyAddr;
- dma_addr_t RxPhyAddr;
- struct page *Rx_databuff[NUM_RX_DESC]; /* Rx data buffers */
struct ring_info tx_skb[NUM_TX_DESC]; /* Tx data buffers */
struct napi_struct *rtl8169_napi;
+ struct rtl8169_rx_ring rx_ring[R8169_MAX_RX_QUEUES];
unsigned int num_rx_rings;
u16 cp_cmd;
u16 tx_lpi_timer;
u32 irq_mask;
+ unsigned int hw_supp_num_rx_queues;
unsigned int irq_nvecs;
struct clk *clk;
@@ -764,6 +782,7 @@ struct rtl8169_private {
unsigned aspm_manageable:1;
unsigned dash_enabled:1;
bool sfp_mode:1;
+ bool recheck_desc_ownbit:1;
dma_addr_t counters_phys_addr;
struct rtl8169_counters *counters;
struct rtl8169_tc_offsets tc_offset;
@@ -2620,9 +2639,27 @@ static void rtl_init_rxcfg(struct rtl8169_private *tp)
}
}
+static void rtl8169_rx_desc_init(struct rtl8169_private *tp)
+{
+ for (int i = 0; i < tp->num_rx_rings; i++) {
+ struct rtl8169_rx_ring *ring = &tp->rx_ring[i];
+
+ memset(ring->rx_desc_array, 0x0, R8169_RX_RING_BYTES);
+ }
+}
+
static void rtl8169_init_ring_indexes(struct rtl8169_private *tp)
{
- tp->dirty_tx = tp->cur_tx = tp->cur_rx = 0;
+ tp->dirty_tx = 0;
+ tp->cur_tx = 0;
+
+ for (int i = 0; i < tp->hw_supp_num_rx_queues; i++) {
+ struct rtl8169_rx_ring *ring = &tp->rx_ring[i];
+
+ ring->dirty_rx = 0;
+ ring->cur_rx = 0;
+ ring->index = i;
+ }
}
static void rtl_jumbo_config(struct rtl8169_private *tp)
@@ -2684,6 +2721,14 @@ static void rtl_hw_reset(struct rtl8169_private *tp)
static void rtl_software_parameter_initialize(struct rtl8169_private *tp)
{
tp->num_rx_rings = 1;
+ switch (tp->mac_version) {
+ case RTL_GIGA_MAC_VER_80:
+ tp->hw_supp_num_rx_queues = R8127_MAX_RX_QUEUES;
+ break;
+ default:
+ tp->hw_supp_num_rx_queues = R8169_DEFAULT_RX_QUEUES;
+ break;
+ }
}
static void rtl_request_firmware(struct rtl8169_private *tp)
@@ -2810,6 +2855,8 @@ static void rtl_set_rx_max_size(struct rtl8169_private *tp)
static void rtl_set_rx_tx_desc_registers(struct rtl8169_private *tp)
{
+ struct rtl8169_rx_ring *ring = &tp->rx_ring[0];
+
/*
* Magic spell: some iop3xx ARM board needs the TxDescAddrHigh
* register to be written before TxDescAddrLow to work.
@@ -2817,8 +2864,16 @@ static void rtl_set_rx_tx_desc_registers(struct rtl8169_private *tp)
*/
RTL_W32(tp, TxDescStartAddrHigh, ((u64) tp->TxPhyAddr) >> 32);
RTL_W32(tp, TxDescStartAddrLow, ((u64) tp->TxPhyAddr) & DMA_BIT_MASK(32));
- RTL_W32(tp, RxDescAddrHigh, ((u64) tp->RxPhyAddr) >> 32);
- RTL_W32(tp, RxDescAddrLow, ((u64) tp->RxPhyAddr) & DMA_BIT_MASK(32));
+ RTL_W32(tp, RxDescAddrHigh, ((u64) ring->rx_phy_addr) >> 32);
+ RTL_W32(tp, RxDescAddrLow, ((u64) ring->rx_phy_addr) & DMA_BIT_MASK(32));
+
+ for (int i = 1; i < tp->num_rx_rings; i++) {
+ unsigned int rdsar_reg = RDSAR_Q1_LOW + (i - 1) * 8;
+ struct rtl8169_rx_ring *ring = &tp->rx_ring[i];
+
+ RTL_W32(tp, rdsar_reg + 4, ((u64)ring->rx_phy_addr >> 32));
+ RTL_W32(tp, rdsar_reg, ((u64)ring->rx_phy_addr) & DMA_BIT_MASK(32));
+ }
}
static void rtl8169_set_magic_reg(struct rtl8169_private *tp)
@@ -4165,8 +4220,9 @@ static void rtl8169_mark_to_asic(struct RxDesc *desc)
}
static struct page *rtl8169_alloc_rx_data(struct rtl8169_private *tp,
- struct RxDesc *desc)
+ struct rtl8169_rx_ring *ring, unsigned int index)
{
+ struct RxDesc *desc = ring->rx_desc_array + index;
struct device *d = tp_to_dev(tp);
int node = dev_to_node(d);
dma_addr_t mapping;
@@ -4184,55 +4240,106 @@ static struct page *rtl8169_alloc_rx_data(struct rtl8169_private *tp,
}
desc->addr = cpu_to_le64(mapping);
+ ring->rx_desc_phy_addr[index] = mapping;
rtl8169_mark_to_asic(desc);
return data;
}
-static void rtl8169_rx_clear(struct rtl8169_private *tp)
+static void rtl8169_rx_clear(struct rtl8169_private *tp, struct rtl8169_rx_ring *ring)
{
int i;
- for (i = 0; i < NUM_RX_DESC && tp->Rx_databuff[i]; i++) {
+ for (i = 0; i < NUM_RX_DESC && ring->rx_databuff[i]; i++) {
dma_unmap_page(tp_to_dev(tp),
- le64_to_cpu(tp->RxDescArray[i].addr),
+ ring->rx_desc_phy_addr[i],
R8169_RX_BUF_SIZE, DMA_FROM_DEVICE);
- __free_pages(tp->Rx_databuff[i], get_order(R8169_RX_BUF_SIZE));
- tp->Rx_databuff[i] = NULL;
- tp->RxDescArray[i].addr = 0;
- tp->RxDescArray[i].opts1 = 0;
+ __free_pages(ring->rx_databuff[i], get_order(R8169_RX_BUF_SIZE));
+ ring->rx_databuff[i] = NULL;
+ ring->rx_desc_phy_addr[i] = 0;
+ ring->rx_desc_array[i].addr = 0;
+ ring->rx_desc_array[i].opts1 = 0;
}
}
-static int rtl8169_rx_fill(struct rtl8169_private *tp)
+static int rtl8169_rx_fill(struct rtl8169_private *tp, struct rtl8169_rx_ring *ring)
{
int i;
for (i = 0; i < NUM_RX_DESC; i++) {
struct page *data;
- data = rtl8169_alloc_rx_data(tp, tp->RxDescArray + i);
+ data = rtl8169_alloc_rx_data(tp, ring, i);
if (!data) {
- rtl8169_rx_clear(tp);
+ rtl8169_rx_clear(tp, ring);
return -ENOMEM;
}
- tp->Rx_databuff[i] = data;
+ ring->rx_databuff[i] = data;
}
/* mark as last descriptor in the ring */
- tp->RxDescArray[NUM_RX_DESC - 1].opts1 |= cpu_to_le32(RingEnd);
+ ring->rx_desc_array[NUM_RX_DESC - 1].opts1 |= cpu_to_le32(RingEnd);
return 0;
}
+static int rtl8169_alloc_rx_desc(struct rtl8169_private *tp)
+{
+ struct pci_dev *pdev = tp->pci_dev;
+ struct rtl8169_rx_ring *ring;
+
+ for (int i = 0; i < tp->num_rx_rings; i++) {
+ ring = &tp->rx_ring[i];
+ ring->rx_desc_array = dma_alloc_coherent(&pdev->dev,
+ R8169_RX_RING_BYTES,
+ &ring->rx_phy_addr,
+ GFP_KERNEL);
+ if (!ring->rx_desc_array)
+ return -ENOMEM;
+ }
+ return 0;
+}
+
+static void rtl8169_free_rx_desc(struct rtl8169_private *tp)
+{
+ struct pci_dev *pdev = tp->pci_dev;
+ struct rtl8169_rx_ring *ring;
+
+ for (int i = 0; i < tp->num_rx_rings; i++) {
+ ring = &tp->rx_ring[i];
+ if (ring->rx_desc_array) {
+ dma_free_coherent(&pdev->dev,
+ R8169_RX_RING_BYTES,
+ ring->rx_desc_array,
+ ring->rx_phy_addr);
+ ring->rx_desc_array = NULL;
+ }
+ }
+}
+
static int rtl8169_init_ring(struct rtl8169_private *tp)
{
+ int i, ret;
+
rtl8169_init_ring_indexes(tp);
+ rtl8169_rx_desc_init(tp);
memset(tp->tx_skb, 0, sizeof(tp->tx_skb));
- memset(tp->Rx_databuff, 0, sizeof(tp->Rx_databuff));
- return rtl8169_rx_fill(tp);
+ for (i = 0; i < tp->num_rx_rings; i++) {
+ struct rtl8169_rx_ring *ring = &tp->rx_ring[i];
+
+ memset(ring->rx_databuff, 0, sizeof(ring->rx_databuff));
+ ret = rtl8169_rx_fill(tp, ring);
+ if (ret < 0)
+ goto err_clear;
+ }
+ return 0;
+
+err_clear:
+ while (--i >= 0)
+ rtl8169_rx_clear(tp, &tp->rx_ring[i]);
+ return ret;
}
static void rtl8169_unmap_tx_skb(struct rtl8169_private *tp, unsigned int entry)
@@ -4321,16 +4428,23 @@ static void rtl8169_cleanup(struct rtl8169_private *tp)
rtl8169_init_ring_indexes(tp);
}
-static void rtl_reset_work(struct rtl8169_private *tp)
+static void rtl8169_rx_desc_reset(struct rtl8169_private *tp)
{
- int i;
+ for (int i = 0; i < tp->num_rx_rings; i++) {
+ struct rtl8169_rx_ring *ring = &tp->rx_ring[i];
+ for (int j = 0; j < NUM_RX_DESC; j++)
+ rtl8169_mark_to_asic(ring->rx_desc_array + j);
+ }
+}
+
+static void rtl_reset_work(struct rtl8169_private *tp)
+{
netif_stop_queue(tp->dev);
rtl8169_cleanup(tp);
- for (i = 0; i < NUM_RX_DESC; i++)
- rtl8169_mark_to_asic(tp->RxDescArray + i);
+ rtl8169_rx_desc_reset(tp);
rtl8169_napi_enable(tp);
rtl_hw_start(tp);
@@ -4776,9 +4890,10 @@ static inline int rtl8169_fragmented_frame(u32 status)
return (status & (FirstFrag | LastFrag)) != (FirstFrag | LastFrag);
}
-static inline void rtl8169_rx_csum(struct sk_buff *skb, u32 opts1)
+static inline void rtl8169_rx_csum(struct sk_buff *skb,
+ struct RxDesc *desc)
{
- u32 status = opts1 & (RxProtoMask | RxCSFailMask);
+ u32 status = le32_to_cpu(desc->opts1) & (RxProtoMask | RxCSFailMask);
if (status == RxProtoTCP || status == RxProtoUDP)
skb->ip_summed = CHECKSUM_UNNECESSARY;
@@ -4786,22 +4901,60 @@ static inline void rtl8169_rx_csum(struct sk_buff *skb, u32 opts1)
skb_checksum_none_assert(skb);
}
-static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget)
+static bool rtl8169_check_rx_desc_error(struct net_device *dev,
+ struct rtl8169_private *tp,
+ u32 status)
+{
+ if (unlikely(status & RxRES)) {
+ if (status & (RxRWT | RxRUNT))
+ dev->stats.rx_length_errors++;
+ if (status & RxCRC)
+ dev->stats.rx_crc_errors++;
+ return true;
+ }
+ return false;
+}
+
+static void rtl8169_set_desc_dma_addr(struct RxDesc *desc,
+ dma_addr_t mapping)
+{
+ desc->addr = cpu_to_le64(mapping);
+}
+
+static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp,
+ struct rtl8169_rx_ring *ring, int budget)
{
struct device *d = tp_to_dev(tp);
int count;
- for (count = 0; count < budget; count++, tp->cur_rx++) {
- unsigned int pkt_size, entry = tp->cur_rx % NUM_RX_DESC;
- struct RxDesc *desc = tp->RxDescArray + entry;
+ for (count = 0; count < budget; count++, ring->cur_rx++) {
+ unsigned int pkt_size, entry = ring->cur_rx % NUM_RX_DESC;
+ struct RxDesc *desc = ring->rx_desc_array + entry;
struct sk_buff *skb;
const void *rx_buf;
dma_addr_t addr;
u32 status;
status = le32_to_cpu(READ_ONCE(desc->opts1));
- if (status & DescOwn)
- break;
+
+ if (status & DescOwn) {
+ if (!tp->recheck_desc_ownbit)
+ break;
+
+ /* Workaround for a hardware issue:
+ * A dummy read to any register forces a PCIe flush. We
+ * choose LED_CTRL here simply because reading it has no
+ * side effects. This ensures the descriptor ownbit is
+ * fully updated in RAM before we recheck it, so that we do
+ * not miss RX packets right before exiting the NAPI
+ * polling loop.
+ */
+ tp->recheck_desc_ownbit = false;
+ RTL_R8(tp, LED_CTRL);
+ status = le32_to_cpu(READ_ONCE(desc->opts1));
+ if (status & DescOwn)
+ break;
+ }
/* This barrier is needed to keep us from reading
* any other fields out of the Rx descriptor until
@@ -4809,15 +4962,11 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
*/
dma_rmb();
- if (unlikely(status & RxRES)) {
+ if (rtl8169_check_rx_desc_error(dev, tp, status)) {
if (net_ratelimit())
netdev_warn(dev, "Rx ERROR. status = %08x\n",
status);
dev->stats.rx_errors++;
- if (status & (RxRWT | RxRUNT))
- dev->stats.rx_length_errors++;
- if (status & RxCRC)
- dev->stats.rx_crc_errors++;
if (!(dev->features & NETIF_F_RXALL))
goto release_descriptor;
@@ -4838,14 +4987,14 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
goto release_descriptor;
}
- skb = napi_alloc_skb(&tp->rtl8169_napi[0], pkt_size);
+ skb = napi_alloc_skb(&tp->rtl8169_napi[ring->index], pkt_size);
if (unlikely(!skb)) {
dev->stats.rx_dropped++;
goto release_descriptor;
}
- addr = le64_to_cpu(desc->addr);
- rx_buf = page_address(tp->Rx_databuff[entry]);
+ addr = ring->rx_desc_phy_addr[entry];
+ rx_buf = page_address(ring->rx_databuff[entry]);
dma_sync_single_for_cpu(d, addr, pkt_size, DMA_FROM_DEVICE);
prefetch(rx_buf);
@@ -4854,7 +5003,7 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
skb->len = pkt_size;
dma_sync_single_for_device(d, addr, pkt_size, DMA_FROM_DEVICE);
- rtl8169_rx_csum(skb, status);
+ rtl8169_rx_csum(skb, desc);
skb->protocol = eth_type_trans(skb, dev);
rtl8169_rx_vlan_tag(desc, skb);
@@ -4862,10 +5011,11 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
if (skb->pkt_type == PACKET_MULTICAST)
dev->stats.multicast++;
- napi_gro_receive(&tp->rtl8169_napi[0], skb);
+ napi_gro_receive(&tp->rtl8169_napi[ring->index], skb);
dev_sw_netstats_rx_add(dev, pkt_size);
release_descriptor:
+ rtl8169_set_desc_dma_addr(desc, ring->rx_desc_phy_addr[entry]);
rtl8169_mark_to_asic(desc);
}
@@ -4895,6 +5045,7 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
phy_mac_interrupt(tp->phydev);
rtl_irq_disable(tp);
+ tp->recheck_desc_ownbit = true;
napi_schedule(napi);
out:
rtl_ack_events(tp, status);
@@ -4972,7 +5123,8 @@ static int rtl8169_poll(struct napi_struct *napi, int budget)
rtl_tx(dev, tp, budget);
- work_done = rtl_rx(dev, tp, budget);
+ for (int i = 0; i < tp->num_rx_rings; i++)
+ work_done += rtl_rx(dev, tp, &tp->rx_ring[i], budget);
if (work_done < budget && napi_complete_done(napi, work_done))
rtl_irq_enable(tp);
@@ -5100,21 +5252,19 @@ static int rtl8169_close(struct net_device *dev)
struct pci_dev *pdev = tp->pci_dev;
pm_runtime_get_sync(&pdev->dev);
-
netif_stop_queue(dev);
rtl8169_down(tp);
- rtl8169_rx_clear(tp);
+ for (int i = 0; i < tp->num_rx_rings; i++)
+ rtl8169_rx_clear(tp, &tp->rx_ring[i]);
rtl8169_free_irq(tp);
phy_disconnect(tp->phydev);
- dma_free_coherent(&pdev->dev, R8169_RX_RING_BYTES, tp->RxDescArray,
- tp->RxPhyAddr);
dma_free_coherent(&pdev->dev, R8169_TX_RING_BYTES, tp->TxDescArray,
tp->TxPhyAddr);
tp->TxDescArray = NULL;
- tp->RxDescArray = NULL;
+ rtl8169_free_rx_desc(tp);
pm_runtime_put_sync(&pdev->dev);
@@ -5145,13 +5295,11 @@ static int rtl_open(struct net_device *dev)
tp->TxDescArray = dma_alloc_coherent(&pdev->dev, R8169_TX_RING_BYTES,
&tp->TxPhyAddr, GFP_KERNEL);
if (!tp->TxDescArray)
- goto out;
-
- tp->RxDescArray = dma_alloc_coherent(&pdev->dev, R8169_RX_RING_BYTES,
- &tp->RxPhyAddr, GFP_KERNEL);
- if (!tp->RxDescArray)
goto err_free_tx_0;
+ if (rtl8169_alloc_rx_desc(tp) < 0)
+ goto err_free_rx_1;
+
retval = rtl8169_init_ring(tp);
if (retval < 0)
goto err_free_rx_1;
@@ -5178,11 +5326,10 @@ static int rtl_open(struct net_device *dev)
rtl8169_free_irq(tp);
err_release_fw_2:
rtl_release_firmware(tp);
- rtl8169_rx_clear(tp);
+ for (int i = 0; i < tp->num_rx_rings; i++)
+ rtl8169_rx_clear(tp, &tp->rx_ring[i]);
err_free_rx_1:
- dma_free_coherent(&pdev->dev, R8169_RX_RING_BYTES, tp->RxDescArray,
- tp->RxPhyAddr);
- tp->RxDescArray = NULL;
+ rtl8169_free_rx_desc(tp);
err_free_tx_0:
dma_free_coherent(&pdev->dev, R8169_TX_RING_BYTES, tp->TxDescArray,
tp->TxPhyAddr);
@@ -5695,7 +5842,10 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
u32 txconfig;
u32 xid;
- dev = devm_alloc_etherdev(&pdev->dev, sizeof (*tp));
+ dev = devm_alloc_etherdev_mqs(&pdev->dev, sizeof(*tp),
+ R8169_MAX_TX_QUEUES,
+ R8169_MAX_RX_QUEUES);
+
if (!dev)
return -ENOMEM;
--
2.43.0
* [Patch net-next v6 3/7] r8169: add support for new interrupt mapping
From: javen @ 2026-05-26 8:11 UTC (permalink / raw)
To: hkallweit1, nic_swsd, andrew+netdev, davem, edumazet, kuba,
pabeni, horms
Cc: netdev, linux-kernel, Javen Xu
From: Javen Xu <javen_xu@realsil.com.cn>
To support RSS, the hardware interrupt bits must correspond one-to-one
to the software interrupt vectors, so add support for the new interrupt
vector mapping. ISR_VEC_MAP_REG is the hardware register that reports
interrupt status. IMR_SET_VEC_MAP_REG is the interrupt mask register
whose bits are set to enable the corresponding irqs.
Signed-off-by: Javen Xu <javen_xu@realsil.com.cn>
---
Changes in v2:
- no changes
Changes in v3:
- init index in napi_struct and get message_id from index
- move rtl8169_disable_hw_interrupt_msix directly before the call to
napi_schedule()
- change the condition in rtl8169_request_irq when RTL_VEC_MAP_ENABLE
enabled, use rtl8169_interrupt_msix
Changes in v4:
- remove flag tp->feature, replace tp->features & RTL_VEC_MAP_ENABLE
with tp->irq_nvecs > 1, they are equivalent.
- follow reverse xmas tree, in rtl8169_interrupt_msix(),
rtl8169_poll_msix_rx(), rtl8169_poll_msix_tx(),
rtl8169_poll_msix_other()
- use napi->index in rtl8169_poll_msix_other()
- add a comment to describe RTL8127 MSI-X vector layout
- simplify r8169_init_napi()
Changes in v5:
- replace magic number in rtl8169_poll_msix_tx()
Changes in v6:
- when irq_nvecs <= 1, use register IntrMask_8125, else use vec map
- fix irq sequence in rtl8169_interrupt_msix(), disable the interrupt
before clearing its status
- remove dead code in rtl8169_poll_msix_tx()
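
Note for readers of the thread: the mask built by rtl_set_irq_mask() in
the irq_nvecs > 1 case and the vector classification done in
r8169_init_napi() can be modelled in plain userspace C. This is only a
sketch under stated assumptions: the constants are copied from the diff
in this mail, while vec_map_irq_mask(), classify_vector() and enum
vec_kind are hypothetical names made up for illustration and are not
part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the patch; everything else here is illustrative. */
#define ISRIMR_ROK_Q0        (UINT32_C(1) << 0)
#define ISRIMR_TOK_Q0        (UINT32_C(1) << 8)
#define ISRIMR_LINKCHG       (UINT32_C(1) << 29)
#define R8127_MAX_RX_QUEUES  8
#define R8127_MAX_TX_QUEUES  8

enum vec_kind { VEC_RX, VEC_TX, VEC_OTHER };	/* hypothetical */

/* Models the irq_nvecs > 1 branch of rtl_set_irq_mask(). */
static uint32_t vec_map_irq_mask(unsigned int num_rx_rings)
{
	uint32_t mask = ISRIMR_LINKCHG | ISRIMR_TOK_Q0;

	assert(num_rx_rings >= 1 && num_rx_rings <= R8127_MAX_RX_QUEUES);
	/* One Rx-OK bit per active Rx ring, starting at bit 0. */
	for (unsigned int i = 0; i < num_rx_rings; i++)
		mask |= ISRIMR_ROK_Q0 << i;

	return mask;
}

/* Models the poll function selection in r8169_init_napi(). */
static enum vec_kind classify_vector(unsigned int vec)
{
	if (vec < R8127_MAX_RX_QUEUES)
		return VEC_RX;
	if (vec < R8127_MAX_RX_QUEUES + R8127_MAX_TX_QUEUES)
		return VEC_TX;
	return VEC_OTHER;
}
```

With four Rx rings, vec_map_irq_mask(4) yields 0x2000010f: bits 0-3 for
Rx, bit 8 for Tx queue 0 and bit 29 for link change. Vector 29 falls in
the "other" class, which matches MSIX_ID_VEC_MAP_LINKCHG.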
---
drivers/net/ethernet/realtek/r8169_main.c | 166 +++++++++++++++++++---
1 file changed, 150 insertions(+), 16 deletions(-)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 62bf77aa1ec8..951d2046a81b 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -79,6 +79,7 @@
#define R8169_TX_START_THRS (2 * R8169_TX_STOP_THRS)
#define R8169_MAX_RX_QUEUES 8
#define R8127_MAX_RX_QUEUES 8
+#define R8127_MAX_TX_QUEUES 8
#define R8169_DEFAULT_RX_QUEUES 1
#define R8169_MAX_TX_QUEUES 1
@@ -449,8 +450,12 @@ enum rtl8125_registers {
RSS_CTRL_8125 = 0x4500,
Q_NUM_CTRL_8125 = 0x4800,
EEE_TXIDLE_TIMER_8125 = 0x6048,
+ IMR_CLEAR_VEC_MAP_REG = 0x0d00,
+ ISR_VEC_MAP_REG = 0x0d04,
+ IMR_SET_VEC_MAP_REG = 0x0d0c,
};
+#define MSIX_ID_VEC_MAP_LINKCHG 29
#define LEDSEL_MASK_8125 0x23f
#define RX_VLAN_INNER_8125 BIT(22)
@@ -581,6 +586,9 @@ enum rtl_register_content {
/* magic enable v2 */
MagicPacket_v2 = (1 << 16), /* Wake up when receives a Magic Packet */
+#define ISRIMR_LINKCHG BIT(29)
+#define ISRIMR_TOK_Q0 BIT(8)
+#define ISRIMR_ROK_Q0 BIT(0)
};
enum rtl_desc_bit {
@@ -1664,26 +1672,38 @@ static u32 rtl_get_events(struct rtl8169_private *tp)
static void rtl_ack_events(struct rtl8169_private *tp, u32 bits)
{
- if (rtl_is_8125(tp))
- RTL_W32(tp, IntrStatus_8125, bits);
- else
+ if (rtl_is_8125(tp)) {
+ if (tp->irq_nvecs > 1)
+ RTL_W32(tp, ISR_VEC_MAP_REG, bits);
+ else
+ RTL_W32(tp, IntrStatus_8125, bits);
+ } else {
RTL_W16(tp, IntrStatus, bits);
+ }
}
static void rtl_irq_disable(struct rtl8169_private *tp)
{
- if (rtl_is_8125(tp))
- RTL_W32(tp, IntrMask_8125, 0);
- else
+ if (rtl_is_8125(tp)) {
+ if (tp->irq_nvecs > 1)
+ RTL_W32(tp, IMR_CLEAR_VEC_MAP_REG, 0xffffffff);
+ else
+ RTL_W32(tp, IntrMask_8125, 0);
+ } else {
RTL_W16(tp, IntrMask, 0);
+ }
}
static void rtl_irq_enable(struct rtl8169_private *tp)
{
- if (rtl_is_8125(tp))
- RTL_W32(tp, IntrMask_8125, tp->irq_mask);
- else
+ if (rtl_is_8125(tp)) {
+ if (tp->irq_nvecs > 1)
+ RTL_W32(tp, IMR_SET_VEC_MAP_REG, tp->irq_mask);
+ else
+ RTL_W32(tp, IntrMask_8125, tp->irq_mask);
+ } else {
RTL_W16(tp, IntrMask, tp->irq_mask);
+ }
}
static void rtl8169_irq_mask_and_ack(struct rtl8169_private *tp)
@@ -5062,6 +5082,45 @@ static void rtl8169_free_irq(struct rtl8169_private *tp)
}
}
+static void rtl8169_disable_hw_interrupt_msix(struct rtl8169_private *tp, int message_id)
+{
+ RTL_W32(tp, IMR_CLEAR_VEC_MAP_REG, BIT(message_id));
+}
+
+static void rtl8169_clear_hw_isr(struct rtl8169_private *tp, int message_id)
+{
+ RTL_W32(tp, ISR_VEC_MAP_REG, BIT(message_id));
+}
+
+static void rtl8169_enable_hw_interrupt_msix(struct rtl8169_private *tp, int message_id)
+{
+ RTL_W32(tp, IMR_SET_VEC_MAP_REG, BIT(message_id));
+}
+
+static irqreturn_t rtl8169_interrupt_msix(int irq, void *dev_instance)
+{
+ struct napi_struct *napi = dev_instance;
+ struct net_device *dev = napi->dev;
+ int message_id = napi->index;
+ struct rtl8169_private *tp;
+
+ tp = netdev_priv(dev);
+
+ if (message_id == MSIX_ID_VEC_MAP_LINKCHG) {
+ rtl8169_clear_hw_isr(tp, message_id);
+ phy_mac_interrupt(tp->phydev);
+ return IRQ_HANDLED;
+ }
+
+ rtl8169_disable_hw_interrupt_msix(tp, message_id);
+ rtl8169_clear_hw_isr(tp, message_id);
+
+ tp->recheck_desc_ownbit = true;
+ napi_schedule(napi);
+
+ return IRQ_HANDLED;
+}
+
static int rtl8169_request_irq(struct rtl8169_private *tp)
{
struct net_device *dev = tp->dev;
@@ -5070,8 +5129,12 @@ static int rtl8169_request_irq(struct rtl8169_private *tp)
for (i = 0; i < tp->irq_nvecs; i++) {
napi = &tp->rtl8169_napi[i];
- rc = pci_request_irq(tp->pci_dev, i, rtl8169_interrupt,
- NULL, napi, "%s-%d", dev->name, i);
+ if (tp->irq_nvecs > 1)
+ rc = pci_request_irq(tp->pci_dev, i, rtl8169_interrupt_msix,
+ NULL, napi, "%s-%d", dev->name, i);
+ else
+ rc = pci_request_irq(tp->pci_dev, i, rtl8169_interrupt,
+ NULL, napi, "%s-%d", dev->name, i);
if (rc)
goto free_irq;
}
@@ -5517,10 +5580,16 @@ static const struct net_device_ops rtl_netdev_ops = {
static void rtl_set_irq_mask(struct rtl8169_private *tp)
{
- tp->irq_mask = RxOK | RxErr | TxOK | TxErr | LinkChg;
+ if (tp->irq_nvecs > 1) {
+ tp->irq_mask = ISRIMR_LINKCHG | ISRIMR_TOK_Q0;
+ for (int i = 0; i < tp->num_rx_rings; i++)
+ tp->irq_mask |= ISRIMR_ROK_Q0 << i;
+ } else {
+ tp->irq_mask = RxOK | RxErr | TxOK | TxErr | LinkChg;
- if (tp->mac_version <= RTL_GIGA_MAC_VER_06)
- tp->irq_mask |= SYSErr | RxFIFOOver;
+ if (tp->mac_version <= RTL_GIGA_MAC_VER_06)
+ tp->irq_mask |= SYSErr | RxFIFOOver;
+ }
}
static int rtl_alloc_irq(struct rtl8169_private *tp)
@@ -5825,10 +5894,75 @@ static void r8169_del_napi_action(void *data)
netif_napi_del(&tp->rtl8169_napi[i]);
}
+static int rtl8169_poll_msix_rx(struct napi_struct *napi, int budget)
+{
+ struct net_device *dev = napi->dev;
+ const int message_id = napi->index;
+ struct rtl8169_private *tp;
+ int work_done = 0;
+
+ tp = netdev_priv(dev);
+
+ if (message_id < tp->num_rx_rings)
+ work_done += rtl_rx(dev, tp, &tp->rx_ring[message_id], budget);
+
+ if (work_done < budget && napi_complete_done(napi, work_done))
+ rtl8169_enable_hw_interrupt_msix(tp, message_id);
+
+ return work_done;
+}
+
+static int rtl8169_poll_msix_tx(struct napi_struct *napi, int budget)
+{
+ struct net_device *dev = napi->dev;
+ const int message_id = napi->index;
+ struct rtl8169_private *tp;
+
+ tp = netdev_priv(dev);
+
+ rtl_tx(dev, tp, budget);
+
+ if (napi_complete_done(napi, 0))
+ rtl8169_enable_hw_interrupt_msix(tp, message_id);
+
+ return 0;
+}
+
+static int rtl8169_poll_msix_other(struct napi_struct *napi, int budget)
+{
+ struct net_device *dev = napi->dev;
+ const int message_id = napi->index;
+ struct rtl8169_private *tp;
+
+ tp = netdev_priv(dev);
+
+ if (napi_complete_done(napi, 0))
+ rtl8169_enable_hw_interrupt_msix(tp, message_id);
+
+ return 0;
+}
+
+/* RTL8127 MSI-X vector layout:
+ * Vectors 0 .. (RxQs - 1) : Rx Queues
+ * Vectors RxQs .. (RxQs + TxQs - 1) : Tx Queues
+ * Vector (RxQs + TxQs) and up : Other events (Link status(29), etc.)
+ */
static void r8169_init_napi(struct rtl8169_private *tp)
{
- for (int i = 0; i < tp->irq_nvecs; i++)
- netif_napi_add(tp->dev, &tp->rtl8169_napi[i], rtl8169_poll);
+ for (int i = 0; i < tp->irq_nvecs; i++) {
+ int (*poll_fn)(struct napi_struct *, int) = rtl8169_poll;
+
+ if (tp->irq_nvecs > 1) {
+ if (i < R8127_MAX_RX_QUEUES)
+ poll_fn = rtl8169_poll_msix_rx;
+ else if (i < R8127_MAX_RX_QUEUES + R8127_MAX_TX_QUEUES)
+ poll_fn = rtl8169_poll_msix_tx;
+ else
+ poll_fn = rtl8169_poll_msix_other;
+ }
+ netif_napi_add(tp->dev, &tp->rtl8169_napi[i], poll_fn);
+ tp->rtl8169_napi[i].index = i;
+ }
devm_add_action_or_reset(&tp->pci_dev->dev, r8169_del_napi_action, tp);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [Patch net-next v6 4/7] r8169: enable new interrupt mapping
2026-05-26 8:11 [Patch net-next v6 0/7] r8169: add RSS support for RTL8127 javen
` (2 preceding siblings ...)
2026-05-26 8:11 ` [Patch net-next v6 3/7] r8169: add support for new interrupt mapping javen
@ 2026-05-26 8:11 ` javen
2026-05-26 8:11 ` [Patch net-next v6 5/7] r8169: add support and enable rss javen
` (3 subsequent siblings)
7 siblings, 0 replies; 18+ messages in thread
From: javen @ 2026-05-26 8:11 UTC (permalink / raw)
To: hkallweit1, nic_swsd, andrew+netdev, davem, edumazet, kuba,
pabeni, horms
Cc: netdev, linux-kernel, Javen Xu
From: Javen Xu <javen_xu@realsil.com.cn>
This patch enables new interrupt mapping for RTL8127.
Signed-off-by: Javen Xu <javen_xu@realsil.com.cn>
---
Changes in v2:
- no changes
Changes in v3:
- no changes
Changes in v4:
- no changes
Changes in v5:
- no changes
Changes in v6:
- no changes
---
drivers/net/ethernet/realtek/r8169_main.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 951d2046a81b..13d955324037 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -3939,6 +3939,15 @@ DECLARE_RTL_COND(rtl_mac_ocp_e00e_cond)
return r8168_mac_ocp_read(tp, 0xe00e) & BIT(13);
}
+static void rtl8169_hw_enable_vec_mapping(struct rtl8169_private *tp)
+{
+ u8 tmp;
+
+ tmp = RTL_R8(tp, INT_CFG0_8125);
+ tmp |= INT_CFG0_ENABLE_8125;
+ RTL_W8(tp, INT_CFG0_8125, tmp);
+}
+
static void rtl_hw_start_8125_common(struct rtl8169_private *tp)
{
rtl_pcie_state_l2l3_disable(tp);
@@ -3947,6 +3956,9 @@ static void rtl_hw_start_8125_common(struct rtl8169_private *tp)
RTL_W32(tp, RSS_CTRL_8125, 0);
RTL_W16(tp, Q_NUM_CTRL_8125, 0);
+ if (tp->irq_nvecs > 1)
+ rtl8169_hw_enable_vec_mapping(tp);
+
/* disable UPS */
r8168_mac_ocp_modify(tp, 0xd40a, 0x0010, 0x0000);
--
2.43.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [Patch net-next v6 5/7] r8169: add support and enable rss
2026-05-26 8:11 [Patch net-next v6 0/7] r8169: add RSS support for RTL8127 javen
` (3 preceding siblings ...)
2026-05-26 8:11 ` [Patch net-next v6 4/7] r8169: enable " javen
@ 2026-05-26 8:11 ` javen
2026-06-02 21:23 ` Heiner Kallweit
2026-05-26 8:11 ` [Patch net-next v6 6/7] r8169: move struct ethtool_ops javen
` (2 subsequent siblings)
7 siblings, 1 reply; 18+ messages in thread
From: javen @ 2026-05-26 8:11 UTC (permalink / raw)
To: hkallweit1, nic_swsd, andrew+netdev, davem, edumazet, kuba,
pabeni, horms
Cc: netdev, linux-kernel, Javen Xu
From: Javen Xu <javen_xu@realsil.com.cn>
This patch adds RSS support for RTL8127 and enables it.
Signed-off-by: Javen Xu <javen_xu@realsil.com.cn>
---
Changes in v2:
- some changes moved from Patch 2/7
Changes in v3:
- add struct rtl8169_rss_data. Allocate it dynamically when needed.
- define rss_key as a u32 array
- replace some magic bit numbers in rtl8169_set_rss_hash_opt() and
rtl8125_set_rx_q_num()
- use union to combine different rx descriptor, refactor struct RxDesc
- remove dead code from rtl8169_double_check_rss_support()
Changes in v4:
- rename macro definitions, e.g. R8127_MAX_IRQ to R8127_MAX_NUM_IRQVEC
- change hw_supp_indir_tbl_entries type to unsigned int
- change init_rx_desc_type type to enum
- remove rtl_check_rss_support(), add helper function
rtl_hw_support_rss()
- remove hw_curr_isr_ver, use irq_nvecs to judge whether we should
enable vector interrupt mapping, use tp->num_rx_rings to judge whether
we should enable rss
- remove function rtl8169_double_check_rss_support(), use
rtl8169_set_rx_ring_num() to set num_rx_rings according to tp->irq_nvecs
Changes in v5:
- no changes
Changes in v6:
- change rss_queue_num type from u8 to unsigned int
- fix rx desc clear in rtl8169_rx_clear() for different desc type
- clamp num_rx_rings with rounddown_pow_of_two()
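
Note for readers of the thread: three pieces of arithmetic in this
patch are easier to follow in isolation. They are the default
indirection table, the packing of that table into 32-bit registers done
by rtl8169_store_reta(), and the Rx ring count chosen by
rtl8169_set_rx_ring_num(). The userspace sketch below models them
without touching hardware. fill_default_indir(), pack_reta() and
pick_rx_rings() are hypothetical helper names used for illustration
only. The modulo mapping is the one ethtool_rxfh_indir_default()
implements, and the loop in pick_rx_rings() stands in for the kernel's
min() and rounddown_pow_of_two().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default spread: entry i goes to ring (i % n_rx_rings). */
static void fill_default_indir(uint8_t *tbl, size_t entries,
			       unsigned int n_rx_rings)
{
	assert(n_rx_rings > 0);
	for (size_t i = 0; i < entries; i++)
		tbl[i] = (uint8_t)(i % n_rx_rings);
}

/* Pack four 8-bit entries per 32-bit word, first entry in the low byte. */
static void pack_reta(const uint8_t *tbl, size_t entries, uint32_t *words)
{
	uint32_t reta = 0;

	assert(entries % 4 == 0);
	for (size_t i = 0; i < entries; i++) {
		reta |= (uint32_t)tbl[i] << ((i & 3) * 8);
		if ((i & 3) == 3) {
			/* The driver writes this word to the next RETA register. */
			words[i / 4] = reta;
			reta = 0;
		}
	}
}

/* Largest power of two that is <= min(wanted, hw_max). */
static unsigned int pick_rx_rings(unsigned int wanted, unsigned int hw_max)
{
	unsigned int n = wanted < hw_max ? wanted : hw_max;
	unsigned int pow2 = 1;

	assert(n > 0);
	while (pow2 * 2 <= n)
		pow2 *= 2;

	return pow2;
}
```

For example, with four rings every packed word is 0x03020100, and a
default of six RSS queues on hardware with eight Rx queues is rounded
down to four rings, so that ilog2(num_rx_rings) written to the queue
number fields is exact.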
---
drivers/net/ethernet/realtek/r8169_main.c | 397 +++++++++++++++++++---
1 file changed, 358 insertions(+), 39 deletions(-)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 13d955324037..a79a8756516d 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -82,6 +82,19 @@
#define R8127_MAX_TX_QUEUES 8
#define R8169_DEFAULT_RX_QUEUES 1
#define R8169_MAX_TX_QUEUES 1
+#define R8127_MAX_NUM_IRQVEC 32
+#define R8127_MIN_NUM_IRQVEC 30
+#define R8169_IRQ_DEFAULT 1
+#define RTL_RSS_KEY_SIZE 40
+#define RSS_CPU_NUM_MASK GENMASK(18, 16)
+#define RSS_HASH_MASK GENMASK(10, 8)
+#define RTL_MAX_INDIRECTION_TABLE_ENTRIES 128
+#define RXS_RSS_UDP BIT(27)
+#define RXS_RSS_IPV4 BIT(28)
+#define RXS_RSS_IPV6 BIT(29)
+#define RXS_RSS_TCP BIT(30)
+#define RXS_RSS_L3_TYPE_MASK (RXS_RSS_IPV4 | RXS_RSS_IPV6)
+#define RXS_RSS_L4_TYPE_MASK (RXS_RSS_TCP | RXS_RSS_UDP)
#define OCP_STD_PHY_BASE 0xa400
@@ -589,6 +602,25 @@ enum rtl_register_content {
#define ISRIMR_LINKCHG BIT(29)
#define ISRIMR_TOK_Q0 BIT(8)
#define ISRIMR_ROK_Q0 BIT(0)
+#define RTL_DESC_TYPE_CTRL 0xd8
+#define RSS_KEY_REG 0x4600
+#define RSS_INDIRECTION_TBL_REG 0x4700
+#define RSS_CTRL_TCP_IPV4_SUPP BIT(0)
+#define RTL_DESC_TYPE_RSS BIT(1)
+#define RSS_CTRL_IPV4_SUPP BIT(1)
+#define RSS_CTRL_TCP_IPV6_SUPP BIT(2)
+#define RSS_CTRL_IPV6_SUPP BIT(3)
+#define RSS_CTRL_IPV6_EXT_SUPP BIT(4)
+#define RSS_CTRL_TCP_IPV6_EXT_SUPP BIT(5)
+#define RSS_CTRL_UDP_IPV4_SUPP BIT(6)
+#define RSS_CTRL_UDP_IPV6_SUPP BIT(7)
+#define RSS_CTRL_UDP_IPV6_EXT_SUPP BIT(8)
+#define RTL_RSS_FLAG_HASH_UDP_IPV4 BIT(0)
+#define RTL_RSS_FLAG_HASH_UDP_IPV6 BIT(1)
+#define RX_RES_RSS BIT(22)
+#define RX_RUNT_RSS BIT(21)
+#define RX_CRC_RSS BIT(20)
+#define RTL_RX_Q_NUM_MASK GENMASK(4, 2)
};
enum rtl_desc_bit {
@@ -646,6 +678,11 @@ enum rtl_rx_desc_bit {
#define RxProtoIP (PID1 | PID0)
#define RxProtoMask RxProtoIP
+#define RX_UDPT_DESC_RSS BIT(19)
+#define RX_TCPT_DESC_RSS BIT(18)
+#define RX_UDPF_DESC_RSS BIT(16) /* UDP/IP checksum failed */
+#define RX_TCPF_DESC_RSS BIT(15) /* TCP/IP checksum failed */
+
IPFail = (1 << 16), /* IP checksum failed */
UDPFail = (1 << 15), /* UDP/IP checksum failed */
TCPFail = (1 << 14), /* TCP/IP checksum failed */
@@ -667,9 +704,27 @@ struct TxDesc {
};
struct RxDesc {
- __le32 opts1;
- __le32 opts2;
- __le64 addr;
+ union {
+ /* RX_DESC_TYPE_DEFAULT */
+ struct {
+ __le32 opts1;
+ __le32 opts2;
+ __le64 addr;
+ };
+
+ /* RX_DESC_TYPE_RSS */
+ struct {
+ union {
+ __le64 rss_addr;
+ struct {
+ __le32 rss_info;
+ __le32 rss_result;
+ } rss_dword;
+ };
+ __le32 rss_opts2;
+ __le32 rss_opts1;
+ };
+ };
};
struct ring_info {
@@ -741,9 +796,9 @@ enum rtl_dash_type {
RTL_DASH_25_BP,
};
-enum rx_desc_ring_type {
- RX_DESC_RING_TYPE_DEFAULT,
- RX_DESC_RING_TYPE_RSS,
+enum rx_desc_type {
+ RX_DESC_TYPE_DEFAULT,
+ RX_DESC_TYPE_RSS,
};
struct rtl8169_rx_ring {
@@ -756,6 +811,12 @@ struct rtl8169_rx_ring {
struct page *rx_databuff[NUM_RX_DESC]; /* Rx data buffers */
};
+struct rtl8169_rss_data {
+ u32 rss_key[RTL_RSS_KEY_SIZE / sizeof(u32)];
+ u8 rss_indir_tbl[RTL_MAX_INDIRECTION_TABLE_ENTRIES];
+ unsigned int hw_supp_indir_tbl_entries;
+};
+
struct rtl8169_private {
void __iomem *mmio_addr; /* memory map physical address */
struct pci_dev *pci_dev;
@@ -775,7 +836,9 @@ struct rtl8169_private {
u16 tx_lpi_timer;
u32 irq_mask;
unsigned int hw_supp_num_rx_queues;
+ struct rtl8169_rss_data *rss_data;
unsigned int irq_nvecs;
+ enum rx_desc_type init_rx_desc_type;
struct clk *clk;
struct {
@@ -1606,6 +1669,11 @@ static bool rtl_dash_is_enabled(struct rtl8169_private *tp)
}
}
+static bool rtl_hw_support_rss(struct rtl8169_private *tp)
+{
+ return tp->mac_version == RTL_GIGA_MAC_VER_80;
+}
+
static enum rtl_dash_type rtl_get_dash_type(struct rtl8169_private *tp)
{
switch (tp->mac_version) {
@@ -1907,9 +1975,20 @@ static inline u32 rtl8169_tx_vlan_tag(struct sk_buff *skb)
TxVlanTag | swab16(skb_vlan_tag_get(skb)) : 0x00;
}
-static void rtl8169_rx_vlan_tag(struct RxDesc *desc, struct sk_buff *skb)
+static void rtl8169_rx_vlan_tag(struct rtl8169_private *tp,
+ struct RxDesc *desc,
+ struct sk_buff *skb)
{
- u32 opts2 = le32_to_cpu(desc->opts2);
+ u32 opts2;
+
+ switch (tp->init_rx_desc_type) {
+ case RX_DESC_TYPE_RSS:
+ opts2 = le32_to_cpu(desc->rss_opts2);
+ break;
+ default:
+ opts2 = le32_to_cpu(desc->opts2);
+ break;
+ }
if (opts2 & RxVlanTag)
__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), swab16(opts2 & 0xffff));
@@ -2738,17 +2817,27 @@ static void rtl_hw_reset(struct rtl8169_private *tp)
rtl_loop_wait_low(tp, &rtl_chipcmd_cond, 100, 100);
}
+static void rtl8169_init_rss(struct rtl8169_private *tp)
+{
+ for (int i = 0; i < tp->rss_data->hw_supp_indir_tbl_entries; i++)
+ tp->rss_data->rss_indir_tbl[i] = ethtool_rxfh_indir_default(i, tp->num_rx_rings);
+
+ netdev_rss_key_fill(tp->rss_data->rss_key, RTL_RSS_KEY_SIZE);
+}
+
static void rtl_software_parameter_initialize(struct rtl8169_private *tp)
{
tp->num_rx_rings = 1;
switch (tp->mac_version) {
case RTL_GIGA_MAC_VER_80:
tp->hw_supp_num_rx_queues = R8127_MAX_RX_QUEUES;
+ tp->rss_data->hw_supp_indir_tbl_entries = RTL_MAX_INDIRECTION_TABLE_ENTRIES;
break;
default:
tp->hw_supp_num_rx_queues = R8169_DEFAULT_RX_QUEUES;
break;
}
+ tp->init_rx_desc_type = RX_DESC_TYPE_DEFAULT;
}
static void rtl_request_firmware(struct rtl8169_private *tp)
@@ -2873,6 +2962,64 @@ static void rtl_set_rx_max_size(struct rtl8169_private *tp)
RTL_W16(tp, RxMaxSize, R8169_RX_BUF_SIZE + 1);
}
+static void rtl8169_store_rss_key(struct rtl8169_private *tp)
+{
+ u32 num_entries = RTL_RSS_KEY_SIZE / sizeof(u32);
+ u32 *rss_key = tp->rss_data->rss_key;
+ const u16 rss_key_reg = RSS_KEY_REG;
+
+ /* Write RSS hash key to HW */
+ for (int i = 0; i < num_entries; i++)
+ RTL_W32(tp, rss_key_reg + (i * 4), rss_key[i]);
+}
+
+static void rtl8169_store_reta(struct rtl8169_private *tp)
+{
+ u32 i, reta_entries = tp->rss_data->hw_supp_indir_tbl_entries;
+ u16 indir_tbl_reg = RSS_INDIRECTION_TBL_REG;
+ u8 *indir_tbl = tp->rss_data->rss_indir_tbl;
+ u32 reta = 0;
+
+ /* Write redirection table to HW */
+ for (i = 0; i < reta_entries; i++) {
+ reta |= indir_tbl[i] << (i & 0x3) * 8;
+ if ((i & 3) == 3) {
+ RTL_W32(tp, indir_tbl_reg, reta);
+ indir_tbl_reg += 4;
+ reta = 0;
+ }
+ }
+}
+
+static int rtl8169_set_rss_hash_opt(struct rtl8169_private *tp)
+{
+ u32 rss_ctrl;
+
+ rss_ctrl = FIELD_PREP(RSS_CPU_NUM_MASK, ilog2(tp->num_rx_rings));
+
+ /* Perform hash on these packet types */
+ rss_ctrl |= RSS_CTRL_TCP_IPV4_SUPP
+ | RSS_CTRL_IPV4_SUPP
+ | RSS_CTRL_IPV6_SUPP
+ | RSS_CTRL_IPV6_EXT_SUPP
+ | RSS_CTRL_TCP_IPV6_SUPP
+ | RSS_CTRL_TCP_IPV6_EXT_SUPP;
+
+ rss_ctrl |= FIELD_PREP(RSS_HASH_MASK,
+ ilog2(tp->rss_data->hw_supp_indir_tbl_entries));
+
+ RTL_W32(tp, RSS_CTRL_8125, rss_ctrl);
+
+ return 0;
+}
+
+static void rtl_set_rss_config(struct rtl8169_private *tp)
+{
+ rtl8169_set_rss_hash_opt(tp);
+ rtl8169_store_reta(tp);
+ rtl8169_store_rss_key(tp);
+}
+
static void rtl_set_rx_tx_desc_registers(struct rtl8169_private *tp)
{
struct rtl8169_rx_ring *ring = &tp->rx_ring[0];
@@ -3939,6 +4086,18 @@ DECLARE_RTL_COND(rtl_mac_ocp_e00e_cond)
return r8168_mac_ocp_read(tp, 0xe00e) & BIT(13);
}
+static void rtl8125_set_rx_q_num(struct rtl8169_private *tp)
+{
+ u16 rx_q_num;
+ u16 q_ctrl;
+
+ rx_q_num = (u16)ilog2(tp->num_rx_rings);
+ q_ctrl = RTL_R16(tp, Q_NUM_CTRL_8125);
+ q_ctrl &= ~RTL_RX_Q_NUM_MASK;
+ q_ctrl |= FIELD_PREP(RTL_RX_Q_NUM_MASK, rx_q_num);
+ RTL_W16(tp, Q_NUM_CTRL_8125, q_ctrl);
+}
+
static void rtl8169_hw_enable_vec_mapping(struct rtl8169_private *tp)
{
u8 tmp;
@@ -3978,6 +4137,13 @@ static void rtl_hw_start_8125_common(struct rtl8169_private *tp)
tp->mac_version == RTL_GIGA_MAC_VER_80)
RTL_W8(tp, 0xD8, RTL_R8(tp, 0xD8) & ~0x02);
+ /* enable rx descriptor type v4 and set queue num for rss */
+ if (tp->num_rx_rings > 1) {
+ rtl8125_set_rx_q_num(tp);
+ RTL_W8(tp, RTL_DESC_TYPE_CTRL,
+ RTL_R8(tp, RTL_DESC_TYPE_CTRL) | RTL_DESC_TYPE_RSS);
+ }
+
if (tp->mac_version == RTL_GIGA_MAC_VER_80)
r8168_mac_ocp_modify(tp, 0xe614, 0x0f00, 0x0f00);
else if (tp->mac_version == RTL_GIGA_MAC_VER_70)
@@ -4214,6 +4380,12 @@ static void rtl_hw_start(struct rtl8169_private *tp)
rtl_hw_aspm_clkreq_enable(tp, true);
rtl_set_rx_max_size(tp);
rtl_set_rx_tx_desc_registers(tp);
+ if (rtl_is_8125(tp)) {
+ if (tp->num_rx_rings > 1)
+ rtl_set_rss_config(tp);
+ else
+ RTL_W32(tp, RSS_CTRL_8125, 0x00);
+ }
rtl_lock_config_regs(tp);
rtl_jumbo_config(tp);
@@ -4241,14 +4413,26 @@ static int rtl8169_change_mtu(struct net_device *dev, int new_mtu)
return 0;
}
-static void rtl8169_mark_to_asic(struct RxDesc *desc)
+static void rtl8169_mark_to_asic(struct rtl8169_private *tp, struct RxDesc *desc)
{
- u32 eor = le32_to_cpu(desc->opts1) & RingEnd;
+ u32 eor;
- desc->opts2 = 0;
- /* Force memory writes to complete before releasing descriptor */
- dma_wmb();
- WRITE_ONCE(desc->opts1, cpu_to_le32(DescOwn | eor | R8169_RX_BUF_SIZE));
+ switch (tp->init_rx_desc_type) {
+ case RX_DESC_TYPE_RSS:
+ eor = le32_to_cpu(desc->rss_opts1) & RingEnd;
+ desc->rss_opts2 = cpu_to_le32(0);
+ /* Force memory writes to complete before releasing descriptor */
+ dma_wmb();
+ WRITE_ONCE(desc->rss_opts1, cpu_to_le32(DescOwn | eor | R8169_RX_BUF_SIZE));
+ break;
+ default:
+ eor = le32_to_cpu(desc->opts1) & RingEnd;
+ desc->opts2 = cpu_to_le32(0);
+ /* Force memory writes to complete before releasing descriptor */
+ dma_wmb();
+ WRITE_ONCE(desc->opts1, cpu_to_le32(DescOwn | eor | R8169_RX_BUF_SIZE));
+ break;
+ }
}
static struct page *rtl8169_alloc_rx_data(struct rtl8169_private *tp,
@@ -4271,9 +4455,12 @@ static struct page *rtl8169_alloc_rx_data(struct rtl8169_private *tp,
return NULL;
}
- desc->addr = cpu_to_le64(mapping);
ring->rx_desc_phy_addr[index] = mapping;
- rtl8169_mark_to_asic(desc);
+ if (tp->init_rx_desc_type == RX_DESC_TYPE_RSS)
+ desc->rss_addr = cpu_to_le64(mapping);
+ else
+ desc->addr = cpu_to_le64(mapping);
+ rtl8169_mark_to_asic(tp, desc);
return data;
}
@@ -4289,8 +4476,25 @@ static void rtl8169_rx_clear(struct rtl8169_private *tp, struct rtl8169_rx_ring
__free_pages(ring->rx_databuff[i], get_order(R8169_RX_BUF_SIZE));
ring->rx_databuff[i] = NULL;
ring->rx_desc_phy_addr[i] = 0;
- ring->rx_desc_array[i].addr = 0;
- ring->rx_desc_array[i].opts1 = 0;
+ if (tp->init_rx_desc_type == RX_DESC_TYPE_RSS) {
+ ring->rx_desc_array[i].rss_addr = 0;
+ ring->rx_desc_array[i].rss_opts1 = 0;
+ } else {
+ ring->rx_desc_array[i].addr = 0;
+ ring->rx_desc_array[i].opts1 = 0;
+ }
+ }
+}
+
+static void rtl8169_mark_as_last_descriptor(struct rtl8169_private *tp, struct RxDesc *desc)
+{
+ switch (tp->init_rx_desc_type) {
+ case RX_DESC_TYPE_RSS:
+ desc->rss_opts1 |= cpu_to_le32(RingEnd);
+ break;
+ default:
+ desc->opts1 |= cpu_to_le32(RingEnd);
+ break;
}
}
@@ -4310,7 +4514,7 @@ static int rtl8169_rx_fill(struct rtl8169_private *tp, struct rtl8169_rx_ring *r
}
/* mark as last descriptor in the ring */
- ring->rx_desc_array[NUM_RX_DESC - 1].opts1 |= cpu_to_le32(RingEnd);
+ rtl8169_mark_as_last_descriptor(tp, &ring->rx_desc_array[NUM_RX_DESC - 1]);
return 0;
}
@@ -4466,7 +4670,7 @@ static void rtl8169_rx_desc_reset(struct rtl8169_private *tp)
struct rtl8169_rx_ring *ring = &tp->rx_ring[i];
for (int j = 0; j < NUM_RX_DESC; j++)
- rtl8169_mark_to_asic(ring->rx_desc_array + j);
+ rtl8169_mark_to_asic(tp, ring->rx_desc_array + j);
}
}
@@ -4922,35 +5126,104 @@ static inline int rtl8169_fragmented_frame(u32 status)
return (status & (FirstFrag | LastFrag)) != (FirstFrag | LastFrag);
}
-static inline void rtl8169_rx_csum(struct sk_buff *skb,
+static inline void rtl8169_rx_hash(struct rtl8169_private *tp,
+ struct RxDesc *desc,
+ struct sk_buff *skb)
+{
+ u32 rss_header_info;
+ u32 hash_val;
+
+ if (!(tp->dev->features & NETIF_F_RXHASH))
+ return;
+
+ rss_header_info = le32_to_cpu(desc->rss_dword.rss_info);
+
+ if (!(rss_header_info & RXS_RSS_L3_TYPE_MASK))
+ return;
+
+ hash_val = le32_to_cpu(desc->rss_dword.rss_result);
+
+ skb_set_hash(skb, hash_val,
+ (RXS_RSS_L4_TYPE_MASK & rss_header_info) ?
+ PKT_HASH_TYPE_L4 : PKT_HASH_TYPE_L3);
+}
+
+static inline void rtl8169_rx_csum(struct rtl8169_private *tp,
+ struct sk_buff *skb,
struct RxDesc *desc)
{
- u32 status = le32_to_cpu(desc->opts1) & (RxProtoMask | RxCSFailMask);
+ bool csum_ok = false;
+ u32 opts1;
- if (status == RxProtoTCP || status == RxProtoUDP)
+ switch (tp->init_rx_desc_type) {
+ case RX_DESC_TYPE_RSS:
+ opts1 = le32_to_cpu(desc->rss_opts1);
+ if (((opts1 & RX_TCPT_DESC_RSS) && !(opts1 & RX_TCPF_DESC_RSS)) ||
+ ((opts1 & RX_UDPT_DESC_RSS) && !(opts1 & RX_UDPF_DESC_RSS)))
+ csum_ok = true;
+ break;
+ default:
+ opts1 = le32_to_cpu(desc->opts1) & (RxProtoMask | RxCSFailMask);
+ if (opts1 == RxProtoTCP || opts1 == RxProtoUDP)
+ csum_ok = true;
+ break;
+ }
+
+ if (csum_ok)
skb->ip_summed = CHECKSUM_UNNECESSARY;
else
skb_checksum_none_assert(skb);
}
+static __le32 rtl8169_rx_desc_opts1(struct rtl8169_private *tp, struct RxDesc *desc)
+{
+ switch (tp->init_rx_desc_type) {
+ case RX_DESC_TYPE_RSS:
+ return READ_ONCE(desc->rss_opts1);
+ default:
+ return READ_ONCE(desc->opts1);
+ }
+}
+
static bool rtl8169_check_rx_desc_error(struct net_device *dev,
struct rtl8169_private *tp,
u32 status)
{
- if (unlikely(status & RxRES)) {
- if (status & (RxRWT | RxRUNT))
- dev->stats.rx_length_errors++;
- if (status & RxCRC)
- dev->stats.rx_crc_errors++;
- return true;
+ switch (tp->init_rx_desc_type) {
+ case RX_DESC_TYPE_RSS:
+ if (unlikely(status & RX_RES_RSS)) {
+ if (status & RX_RUNT_RSS)
+ dev->stats.rx_length_errors++;
+ if (status & RX_CRC_RSS)
+ dev->stats.rx_crc_errors++;
+ return true;
+ }
+ break;
+ default:
+ if (unlikely(status & RxRES)) {
+ if (status & (RxRWT | RxRUNT))
+ dev->stats.rx_length_errors++;
+ if (status & RxCRC)
+ dev->stats.rx_crc_errors++;
+ return true;
+ }
+ break;
}
return false;
}
-static void rtl8169_set_desc_dma_addr(struct RxDesc *desc,
+static void rtl8169_set_desc_dma_addr(struct rtl8169_private *tp,
+ struct RxDesc *desc,
dma_addr_t mapping)
{
- desc->addr = cpu_to_le64(mapping);
+ switch (tp->init_rx_desc_type) {
+ case RX_DESC_TYPE_RSS:
+ desc->rss_addr = cpu_to_le64(mapping);
+ break;
+ default:
+ desc->addr = cpu_to_le64(mapping);
+ break;
+ }
}
static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp,
@@ -4967,7 +5240,7 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp,
dma_addr_t addr;
u32 status;
- status = le32_to_cpu(READ_ONCE(desc->opts1));
+ status = le32_to_cpu(rtl8169_rx_desc_opts1(tp, desc));
if (status & DescOwn) {
if (!tp->recheck_desc_ownbit)
@@ -4983,7 +5256,7 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp,
*/
tp->recheck_desc_ownbit = false;
RTL_R8(tp, LED_CTRL);
- status = le32_to_cpu(READ_ONCE(desc->opts1));
+ status = le32_to_cpu(rtl8169_rx_desc_opts1(tp, desc));
if (status & DescOwn)
break;
}
@@ -5034,11 +5307,12 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp,
skb->tail += pkt_size;
skb->len = pkt_size;
dma_sync_single_for_device(d, addr, pkt_size, DMA_FROM_DEVICE);
-
- rtl8169_rx_csum(skb, desc);
+ if (tp->num_rx_rings > 1)
+ rtl8169_rx_hash(tp, desc, skb);
+ rtl8169_rx_csum(tp, skb, desc);
skb->protocol = eth_type_trans(skb, dev);
- rtl8169_rx_vlan_tag(desc, skb);
+ rtl8169_rx_vlan_tag(tp, desc, skb);
if (skb->pkt_type == PACKET_MULTICAST)
dev->stats.multicast++;
@@ -5047,8 +5321,8 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp,
dev_sw_netstats_rx_add(dev, pkt_size);
release_descriptor:
- rtl8169_set_desc_dma_addr(desc, ring->rx_desc_phy_addr[entry]);
- rtl8169_mark_to_asic(desc);
+ rtl8169_set_desc_dma_addr(tp, desc, ring->rx_desc_phy_addr[entry]);
+ rtl8169_mark_to_asic(tp, desc);
}
return count;
@@ -5604,6 +5878,32 @@ static void rtl_set_irq_mask(struct rtl8169_private *tp)
}
}
+static int get_max_irq_nvecs(struct rtl8169_private *tp)
+{
+ if (tp->mac_version == RTL_GIGA_MAC_VER_80)
+ return R8127_MAX_NUM_IRQVEC;
+ return R8169_IRQ_DEFAULT;
+}
+
+static int get_min_irq_nvecs(struct rtl8169_private *tp)
+{
+ if (tp->mac_version == RTL_GIGA_MAC_VER_80)
+ return R8127_MIN_NUM_IRQVEC;
+ return R8169_IRQ_DEFAULT;
+}
+
+static void rtl8169_set_rx_ring_num(struct rtl8169_private *tp)
+{
+ if (tp->irq_nvecs >= get_min_irq_nvecs(tp)) {
+ unsigned int rss_queue_num = netif_get_num_default_rss_queues();
+
+ tp->num_rx_rings = rounddown_pow_of_two(min(rss_queue_num,
+ tp->hw_supp_num_rx_queues));
+ if (tp->num_rx_rings >= 2)
+ tp->init_rx_desc_type = RX_DESC_TYPE_RSS;
+ }
+}
+
static int rtl_alloc_irq(struct rtl8169_private *tp)
{
struct pci_dev *pdev = tp->pci_dev;
@@ -5624,7 +5924,10 @@ static int rtl_alloc_irq(struct rtl8169_private *tp)
break;
}
- nvecs = pci_alloc_irq_vectors(pdev, 1, 1, flags);
+ nvecs = pci_alloc_irq_vectors(pdev, get_min_irq_nvecs(tp), get_max_irq_nvecs(tp), flags);
+
+ if (nvecs < 0)
+ nvecs = pci_alloc_irq_vectors(pdev, 1, 1, flags);
if (nvecs < 0)
return nvecs;
@@ -6071,6 +6374,12 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
tp->dash_type = rtl_get_dash_type(tp);
tp->dash_enabled = rtl_dash_is_enabled(tp);
+ if (rtl_hw_support_rss(tp)) {
+ tp->rss_data = devm_kzalloc(&pdev->dev, sizeof(*tp->rss_data), GFP_KERNEL);
+ if (!tp->rss_data)
+ return -ENOMEM;
+ }
+
tp->cp_cmd = RTL_R16(tp, CPlusCmd) & CPCMD_MASK;
if (sizeof(dma_addr_t) > 4 && tp->mac_version >= RTL_GIGA_MAC_VER_18 &&
@@ -6096,6 +6405,11 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
if (!tp->rtl8169_napi)
return -ENOMEM;
+ rtl8169_set_rx_ring_num(tp);
+
+ if (rtl_hw_support_rss(tp))
+ rtl8169_init_rss(tp);
+
INIT_WORK(&tp->wk.work, rtl_task);
disable_work(&tp->wk.work);
@@ -6110,6 +6424,11 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
dev->vlan_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO;
dev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
+ if (rtl_hw_support_rss(tp) && tp->num_rx_rings > 1) {
+ dev->hw_features |= NETIF_F_RXHASH;
+ dev->features |= NETIF_F_RXHASH;
+ }
+
/*
* Pretend we are using VLANs; This bypasses a nasty bug where
* Interrupts stop flowing on high load on 8110SCd controllers.
--
2.43.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [Patch net-next v6 6/7] r8169: move struct ethtool_ops
2026-05-26 8:11 [Patch net-next v6 0/7] r8169: add RSS support for RTL8127 javen
` (4 preceding siblings ...)
2026-05-26 8:11 ` [Patch net-next v6 5/7] r8169: add support and enable rss javen
@ 2026-05-26 8:11 ` javen
2026-05-26 8:11 ` [Patch net-next v6 7/7] r8169: support setting rx queue numbers via ethtool javen
2026-06-01 2:14 ` [Patch net-next v6 0/7] r8169: add RSS support for RTL8127 Javen
7 siblings, 0 replies; 18+ messages in thread
From: javen @ 2026-05-26 8:11 UTC (permalink / raw)
To: hkallweit1, nic_swsd, andrew+netdev, davem, edumazet, kuba,
pabeni, horms
Cc: netdev, linux-kernel, Javen Xu
From: Javen Xu <javen_xu@realsil.com.cn>
The patch moves the rtl8169_ethtool_ops definition further down in
r8169_main.c so that subsequent additions of rtl8169_get_channels and
rtl8169_set_channels can be referenced from the ops struct without
needing forward declarations.
Signed-off-by: Javen Xu <javen_xu@realsil.com.cn>
---
Changes in v2:
- no changes
Changes in v3:
- no changes
Changes in v4:
- no changes
Changes in v5:
- no changes
Changes in v6:
- modify commit message
---
drivers/net/ethernet/realtek/r8169_main.c | 56 +++++++++++------------
1 file changed, 28 insertions(+), 28 deletions(-)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index a79a8756516d..bf031f09437f 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -2535,34 +2535,6 @@ static int rtl8169_set_link_ksettings(struct net_device *ndev,
return 0;
}
-static const struct ethtool_ops rtl8169_ethtool_ops = {
- .supported_coalesce_params = ETHTOOL_COALESCE_USECS |
- ETHTOOL_COALESCE_MAX_FRAMES,
- .get_drvinfo = rtl8169_get_drvinfo,
- .get_regs_len = rtl8169_get_regs_len,
- .get_link = ethtool_op_get_link,
- .get_coalesce = rtl_get_coalesce,
- .set_coalesce = rtl_set_coalesce,
- .get_regs = rtl8169_get_regs,
- .get_wol = rtl8169_get_wol,
- .set_wol = rtl8169_set_wol,
- .get_strings = rtl8169_get_strings,
- .get_sset_count = rtl8169_get_sset_count,
- .get_ethtool_stats = rtl8169_get_ethtool_stats,
- .get_ts_info = ethtool_op_get_ts_info,
- .nway_reset = phy_ethtool_nway_reset,
- .get_eee = rtl8169_get_eee,
- .set_eee = rtl8169_set_eee,
- .get_link_ksettings = phy_ethtool_get_link_ksettings,
- .set_link_ksettings = rtl8169_set_link_ksettings,
- .get_ringparam = rtl8169_get_ringparam,
- .get_pause_stats = rtl8169_get_pause_stats,
- .get_pauseparam = rtl8169_get_pauseparam,
- .set_pauseparam = rtl8169_set_pauseparam,
- .get_eth_mac_stats = rtl8169_get_eth_mac_stats,
- .get_eth_ctrl_stats = rtl8169_get_eth_ctrl_stats,
-};
-
static const struct rtl_chip_info *rtl8169_get_chip_version(u32 xid, bool gmii)
{
/* Chips combining a 1Gbps MAC with a 100Mbps PHY */
@@ -6281,6 +6253,34 @@ static void r8169_init_napi(struct rtl8169_private *tp)
devm_add_action_or_reset(&tp->pci_dev->dev, r8169_del_napi_action, tp);
}
+static const struct ethtool_ops rtl8169_ethtool_ops = {
+ .supported_coalesce_params = ETHTOOL_COALESCE_USECS |
+ ETHTOOL_COALESCE_MAX_FRAMES,
+ .get_drvinfo = rtl8169_get_drvinfo,
+ .get_regs_len = rtl8169_get_regs_len,
+ .get_link = ethtool_op_get_link,
+ .get_coalesce = rtl_get_coalesce,
+ .set_coalesce = rtl_set_coalesce,
+ .get_regs = rtl8169_get_regs,
+ .get_wol = rtl8169_get_wol,
+ .set_wol = rtl8169_set_wol,
+ .get_strings = rtl8169_get_strings,
+ .get_sset_count = rtl8169_get_sset_count,
+ .get_ethtool_stats = rtl8169_get_ethtool_stats,
+ .get_ts_info = ethtool_op_get_ts_info,
+ .nway_reset = phy_ethtool_nway_reset,
+ .get_eee = rtl8169_get_eee,
+ .set_eee = rtl8169_set_eee,
+ .get_link_ksettings = phy_ethtool_get_link_ksettings,
+ .set_link_ksettings = rtl8169_set_link_ksettings,
+ .get_ringparam = rtl8169_get_ringparam,
+ .get_pause_stats = rtl8169_get_pause_stats,
+ .get_pauseparam = rtl8169_get_pauseparam,
+ .set_pauseparam = rtl8169_set_pauseparam,
+ .get_eth_mac_stats = rtl8169_get_eth_mac_stats,
+ .get_eth_ctrl_stats = rtl8169_get_eth_ctrl_stats,
+};
+
static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
{
const struct rtl_chip_info *chip;
--
2.43.0
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [Patch net-next v6 7/7] r8169: support setting rx queue numbers via ethtool
2026-05-26 8:11 [Patch net-next v6 0/7] r8169: add RSS support for RTL8127 javen
` (5 preceding siblings ...)
2026-05-26 8:11 ` [Patch net-next v6 6/7] r8169: move struct ethtool_ops javen
@ 2026-05-26 8:11 ` javen
2026-06-01 2:14 ` [Patch net-next v6 0/7] r8169: add RSS support for RTL8127 Javen
7 siblings, 0 replies; 18+ messages in thread
From: javen @ 2026-05-26 8:11 UTC (permalink / raw)
To: hkallweit1, nic_swsd, andrew+netdev, davem, edumazet, kuba,
pabeni, horms
Cc: netdev, linux-kernel, Javen Xu
From: Javen Xu <javen_xu@realsil.com.cn>
This patch adds support for changing the number of rx queues via ethtool.
The rx queue count can be set to 1, 2, 4 or 8 with
ethtool -L eth1 rx <num>.
Signed-off-by: Javen Xu <javen_xu@realsil.com.cn>
---
Changes in v2:
- no changes
Changes in v3:
- no changes
Changes in v4:
- remove rss_support and rss_enable
- remove some unneeded zero-initializations
- use kzalloc_objs instead of kcalloc
Changes in v5:
- no changes
Changes in v6:
- change subject of this patch
- defer the assignment of tp->init_rx_desc_type until after
rtl8169_down()
- call netif_set_real_num_rx_queues() to synchronize the new rx queue
number with networking core
---
drivers/net/ethernet/realtek/r8169_main.c | 125 ++++++++++++++++++++++
1 file changed, 125 insertions(+)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index bf031f09437f..039465cd7ee1 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -6253,6 +6253,129 @@ static void r8169_init_napi(struct rtl8169_private *tp)
devm_add_action_or_reset(&tp->pci_dev->dev, r8169_del_napi_action, tp);
}
+static void rtl8169_get_channels(struct net_device *dev,
+ struct ethtool_channels *ch)
+{
+ struct rtl8169_private *tp = netdev_priv(dev);
+
+ ch->max_rx = tp->hw_supp_num_rx_queues;
+ ch->max_tx = 1;
+
+ ch->rx_count = tp->num_rx_rings;
+ ch->tx_count = 1;
+}
+
+static int rtl8169_realloc_rx(struct rtl8169_private *tp,
+ struct rtl8169_rx_ring *new_rx,
+ int new_count)
+{
+ int i, ret;
+
+ for (i = 0; i < new_count; i++) {
+ struct rtl8169_rx_ring *ring = &new_rx[i];
+
+ ring->rx_desc_array = dma_alloc_coherent(&tp->pci_dev->dev,
+ R8169_RX_RING_BYTES,
+ &ring->rx_phy_addr,
+ GFP_KERNEL);
+ if (!ring->rx_desc_array) {
+ ret = -ENOMEM;
+ goto err_free;
+ }
+
+ memset(ring->rx_databuff, 0, sizeof(ring->rx_databuff));
+ ret = rtl8169_rx_fill(tp, ring);
+ if (ret) {
+ dma_free_coherent(&tp->pci_dev->dev, R8169_RX_RING_BYTES,
+ ring->rx_desc_array, ring->rx_phy_addr);
+ goto err_free;
+ }
+ }
+ return 0;
+
+err_free:
+ while (--i >= 0) {
+ rtl8169_rx_clear(tp, &new_rx[i]);
+ dma_free_coherent(&tp->pci_dev->dev, R8169_RX_RING_BYTES,
+ new_rx[i].rx_desc_array, new_rx[i].rx_phy_addr);
+ }
+ return ret;
+}
+
+static int rtl8169_set_channels(struct net_device *dev,
+ struct ethtool_channels *ch)
+{
+ struct rtl8169_private *tp = netdev_priv(dev);
+ bool if_running = netif_running(dev);
+ enum rx_desc_type old_rx_desc_type;
+ enum rx_desc_type new_desc_type;
+ struct rtl8169_rx_ring *new_rx;
+ int i, ret;
+
+ old_rx_desc_type = tp->init_rx_desc_type;
+
+ if (!rtl_hw_support_rss(tp)) {
+ netdev_warn(dev, "This chip does not support multiple channels/RSS.\n");
+ return -EOPNOTSUPP;
+ }
+
+ if (ch->rx_count > R8169_MAX_RX_QUEUES)
+ return -EINVAL;
+
+ new_desc_type = ch->rx_count > 1 ? RX_DESC_TYPE_RSS : RX_DESC_TYPE_DEFAULT;
+
+ if (!if_running) {
+ tp->num_rx_rings = ch->rx_count;
+ tp->init_rx_desc_type = new_desc_type;
+ return 0;
+ }
+
+ netif_stop_queue(dev);
+ rtl8169_down(tp);
+
+ new_rx = kzalloc_objs(*new_rx, R8169_MAX_RX_QUEUES);
+ if (!new_rx)
+ return -ENOMEM;
+
+ tp->init_rx_desc_type = new_desc_type;
+ ret = rtl8169_realloc_rx(tp, new_rx, ch->rx_count);
+ if (ret) {
+ tp->init_rx_desc_type = old_rx_desc_type;
+ kfree(new_rx);
+ return ret;
+ }
+
+ for (i = 0; i < tp->num_rx_rings; i++)
+ rtl8169_rx_clear(tp, &tp->rx_ring[i]);
+ rtl8169_free_rx_desc(tp);
+
+ tp->num_rx_rings = ch->rx_count;
+
+ memset(tp->rx_ring, 0, sizeof(tp->rx_ring));
+ memcpy(tp->rx_ring, new_rx, sizeof(*new_rx) * ch->rx_count);
+
+ for (i = 0; i < tp->rss_data->hw_supp_indir_tbl_entries; i++) {
+ if (ch->rx_count > 1)
+ tp->rss_data->rss_indir_tbl[i] =
+ ethtool_rxfh_indir_default(i, tp->num_rx_rings);
+ else
+ tp->rss_data->rss_indir_tbl[i] = 0;
+ }
+
+ rtl_set_irq_mask(tp);
+
+ rtl8169_up(tp);
+ netif_start_queue(dev);
+
+ ret = netif_set_real_num_rx_queues(dev, ch->rx_count);
+ if (ret)
+ netdev_warn(dev, "Failed to set real num rx queues\n");
+
+ kfree(new_rx);
+
+ return 0;
+}
+
static const struct ethtool_ops rtl8169_ethtool_ops = {
.supported_coalesce_params = ETHTOOL_COALESCE_USECS |
ETHTOOL_COALESCE_MAX_FRAMES,
@@ -6271,6 +6394,8 @@ static const struct ethtool_ops rtl8169_ethtool_ops = {
.nway_reset = phy_ethtool_nway_reset,
.get_eee = rtl8169_get_eee,
.set_eee = rtl8169_set_eee,
+ .get_channels = rtl8169_get_channels,
+ .set_channels = rtl8169_set_channels,
.get_link_ksettings = phy_ethtool_get_link_ksettings,
.set_link_ksettings = rtl8169_set_link_ksettings,
.get_ringparam = rtl8169_get_ringparam,
--
2.43.0
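
For readers following the indirection table reset in rtl8169_set_channels()
above, here is a minimal user-space sketch of what that loop computes. It
assumes the kernel helper ethtool_rxfh_indir_default() spreads entries
round-robin (index modulo ring count); indir_default() and fill_indir_tbl()
are hypothetical names used only for this illustration, and the table size
is whatever the caller passes in, not the real hw_supp_indir_tbl_entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for ethtool_rxfh_indir_default(): round-robin. */
static uint32_t indir_default(uint32_t index, uint32_t n_rx_rings)
{
	assert(n_rx_rings > 0);
	return index % n_rx_rings;
}

/*
 * Fill an RSS indirection table the way the patch's loop does:
 * spread entries across the rings when more than one ring is active,
 * otherwise point every entry at ring 0.
 */
void fill_indir_tbl(uint8_t *tbl, size_t entries, uint32_t rx_count)
{
	for (size_t i = 0; i < entries; i++) {
		if (rx_count > 1)
			tbl[i] = (uint8_t)indir_default((uint32_t)i, rx_count);
		else
			tbl[i] = 0;
	}
}
```

With four rings, consecutive table entries cycle through rings 0 to 3, so
flows whose hash lands on neighbouring entries end up on different queues.
With a single ring the table collapses to all zeros.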
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [Patch net-next v6 1/7] r8169: add support for multi irqs
2026-05-26 8:11 ` [Patch net-next v6 1/7] r8169: add support for multi irqs javen
@ 2026-05-29 1:00 ` Jakub Kicinski
2026-05-29 5:43 ` Javen
2026-06-02 21:22 ` Heiner Kallweit
1 sibling, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2026-05-29 1:00 UTC (permalink / raw)
To: javen
Cc: hkallweit1, nic_swsd, andrew+netdev, davem, edumazet, pabeni,
horms, netdev, linux-kernel
On Tue, 26 May 2026 16:11:11 +0800 javen wrote:
> @@ -4820,7 +4838,7 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
> goto release_descriptor;
> }
>
> - skb = napi_alloc_skb(&tp->napi, pkt_size);
> + skb = napi_alloc_skb(&tp->rtl8169_napi[0], pkt_size);
the caller is the NAPI poll function, you should pass that NAPI
as arg to rtl_rx() already instead of hardcoding [0] in this patch.
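
A minimal user-space sketch of the shape being asked for here, under stated
assumptions: toy_napi, toy_priv, toy_rtl_rx() and toy_poll() are hypothetical
stand-ins, not driver code, and the priv back-pointer plays the role of
napi->dev. The point is only that the poll callback hands its own NAPI to the
rx path, so the rx path never has to index the array itself.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NUM_NAPI 8

struct toy_priv;

/* Toy stand-in for struct napi_struct; priv plays the role of napi->dev. */
struct toy_napi {
	struct toy_priv *priv;
	int id;
};

struct toy_priv {
	struct toy_napi napi[TOY_NUM_NAPI];
	int last_napi_used;	/* records which NAPI the rx path used */
};

void toy_priv_init(struct toy_priv *tp)
{
	for (int i = 0; i < TOY_NUM_NAPI; i++) {
		tp->napi[i].priv = tp;
		tp->napi[i].id = i;
	}
	tp->last_napi_used = -1;
}

/* The rx path receives the polling NAPI from its caller. */
static int toy_rtl_rx(struct toy_priv *tp, struct toy_napi *napi, int budget)
{
	assert(napi->priv == tp);
	/* Stands in for napi_alloc_skb(napi, ...) and napi_gro_receive(napi, ...). */
	tp->last_napi_used = napi->id;
	return budget > 0 ? 1 : 0;	/* pretend one packet was processed */
}

/* Poll callback: forwards its own NAPI rather than hardcoding napi[0]. */
int toy_poll(struct toy_napi *napi, int budget)
{
	return toy_rtl_rx(napi->priv, napi, budget);
}
```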
> if (unlikely(!skb)) {
> dev->stats.rx_dropped++;
> goto release_descriptor;
> @@ -4844,7 +4862,7 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
> if (skb->pkt_type == PACKET_MULTICAST)
> dev->stats.multicast++;
>
> - napi_gro_receive(&tp->napi, skb);
> + napi_gro_receive(&tp->rtl8169_napi[0], skb);
>
> dev_sw_netstats_rx_add(dev, pkt_size);
> release_descriptor:
> +static int rtl8169_set_real_num_queues(struct rtl8169_private *tp)
> +{
> + int ret;
> +
> + ret = netif_set_real_num_tx_queues(tp->dev, 1);
> + if (ret < 0)
> + return ret;
> +
> + return netif_set_real_num_rx_queues(tp->dev, tp->num_rx_rings);
netif_set_real_num_queues() exists, just call it directly instead of
adding your own helper.
> +}
> +
> static int rtl_jumbo_max(struct rtl8169_private *tp)
> {
> /* Non-GBit versions don't support jumbo frames */
> @@ -5599,6 +5669,22 @@ static bool rtl_aspm_is_safe(struct rtl8169_private *tp)
> return false;
> }
>
> +static void r8169_del_napi_action(void *data)
> +{
> + struct rtl8169_private *tp = data;
> + int i;
> +
> + for (i = 0; i < tp->irq_nvecs; i++)
> + netif_napi_del(&tp->rtl8169_napi[i]);
> +}
> +
> +static void r8169_init_napi(struct rtl8169_private *tp)
> +{
> + for (int i = 0; i < tp->irq_nvecs; i++)
> + netif_napi_add(tp->dev, &tp->rtl8169_napi[i], rtl8169_poll);
> + devm_add_action_or_reset(&tp->pci_dev->dev, r8169_del_napi_action, tp);
devm_add_action_or_reset() can fail (as the AI bots point out),
but this whole devm_ dance is entirely unnecessary: the
networking stack will automatically delete NAPI instances when the
device is unregistered.
--
pw-bot: cr
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Patch net-next v6 2/7] r8169: add support for multi rx queues
2026-05-26 8:11 ` [Patch net-next v6 2/7] r8169: add support for multi rx queues javen
@ 2026-05-29 1:04 ` Jakub Kicinski
2026-05-29 6:47 ` Javen
0 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2026-05-29 1:04 UTC (permalink / raw)
To: javen
Cc: hkallweit1, nic_swsd, andrew+netdev, davem, edumazet, pabeni,
horms, netdev, linux-kernel
On Tue, 26 May 2026 16:11:12 +0800 javen wrote:
> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
> index 22e843baffc7..62bf77aa1ec8 100644
> --- a/drivers/net/ethernet/realtek/r8169_main.c
> +++ b/drivers/net/ethernet/realtek/r8169_main.c
> @@ -74,9 +74,13 @@
> #define NUM_TX_DESC 256 /* Number of Tx descriptor registers */
> #define NUM_RX_DESC 256 /* Number of Rx descriptor registers */
> #define R8169_TX_RING_BYTES (NUM_TX_DESC * sizeof(struct TxDesc))
> -#define R8169_RX_RING_BYTES (NUM_RX_DESC * sizeof(struct RxDesc))
> +#define R8169_RX_RING_BYTES ((NUM_RX_DESC + 1) * sizeof(struct RxDesc))
AI bots are asking why the "+ 1"?
> #define R8169_TX_STOP_THRS (MAX_SKB_FRAGS + 1)
> #define R8169_TX_START_THRS (2 * R8169_TX_STOP_THRS)
> +#define R8169_MAX_RX_QUEUES 8
> +#define R8127_MAX_RX_QUEUES 8
> +#define R8169_DEFAULT_RX_QUEUES 1
> +#define R8169_MAX_TX_QUEUES 1
>
> #define OCP_STD_PHY_BASE 0xa400
>
> @@ -441,6 +445,7 @@ enum rtl8125_registers {
> TxPoll_8125 = 0x90,
> LEDSEL3 = 0x96,
> MAC0_BKP = 0x19e0,
> + RDSAR_Q1_LOW = 0x4000,
> RSS_CTRL_8125 = 0x4500,
> Q_NUM_CTRL_8125 = 0x4800,
> EEE_TXIDLE_TIMER_8125 = 0x6048,
> @@ -728,6 +733,21 @@ enum rtl_dash_type {
> RTL_DASH_25_BP,
> };
>
> +enum rx_desc_ring_type {
> + RX_DESC_RING_TYPE_DEFAULT,
> + RX_DESC_RING_TYPE_RSS,
> +};
> +
> +struct rtl8169_rx_ring {
> + u32 index; /* Rx queue index */
> + u32 cur_rx; /* Index of next Rx pkt. */
> + u32 dirty_rx; /* Index for recycling. */
> + struct RxDesc *rx_desc_array; /* array of Rx Desc*/
> + dma_addr_t rx_desc_phy_addr[NUM_RX_DESC]; /* Rx data buffer physical dma address */
> + dma_addr_t rx_phy_addr; /* Rx desc physical address */
> + struct page *rx_databuff[NUM_RX_DESC]; /* Rx data buffers */
> +};
> +
> struct rtl8169_private {
> void __iomem *mmio_addr; /* memory map physical address */
> struct pci_dev *pci_dev;
> @@ -735,20 +755,18 @@ struct rtl8169_private {
> struct phy_device *phydev;
> enum mac_version mac_version;
> enum rtl_dash_type dash_type;
> - u32 cur_rx; /* Index into the Rx descriptor buffer of next Rx pkt. */
> u32 cur_tx; /* Index into the Tx descriptor buffer of next Rx pkt. */
> u32 dirty_tx;
> struct TxDesc *TxDescArray; /* 256-aligned Tx descriptor ring */
> - struct RxDesc *RxDescArray; /* 256-aligned Rx descriptor ring */
> dma_addr_t TxPhyAddr;
> - dma_addr_t RxPhyAddr;
> - struct page *Rx_databuff[NUM_RX_DESC]; /* Rx data buffers */
> struct ring_info tx_skb[NUM_TX_DESC]; /* Tx data buffers */
> struct napi_struct *rtl8169_napi;
> + struct rtl8169_rx_ring rx_ring[R8169_MAX_RX_QUEUES];
> unsigned int num_rx_rings;
> u16 cp_cmd;
> u16 tx_lpi_timer;
> u32 irq_mask;
> + unsigned int hw_supp_num_rx_queues;
> unsigned int irq_nvecs;
> struct clk *clk;
>
> @@ -764,6 +782,7 @@ struct rtl8169_private {
> unsigned aspm_manageable:1;
> unsigned dash_enabled:1;
> bool sfp_mode:1;
> + bool recheck_desc_ownbit:1;
AI bots ask if this needs to be set for all chips or just some specific
version. Also, I think this workaround should be added in a dedicated
commit. For ease of review the introduction of struct rtl8169_rx_ring
should be a code-reshuffling type of commit, rather than a functional
change.
> dma_addr_t counters_phys_addr;
> struct rtl8169_counters *counters;
> struct rtl8169_tc_offsets tc_offset;
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: [Patch net-next v6 1/7] r8169: add support for multi irqs
2026-05-29 1:00 ` Jakub Kicinski
@ 2026-05-29 5:43 ` Javen
2026-05-29 18:07 ` Jakub Kicinski
0 siblings, 1 reply; 18+ messages in thread
From: Javen @ 2026-05-29 5:43 UTC (permalink / raw)
To: Jakub Kicinski
Cc: hkallweit1@gmail.com, nic_swsd@realtek.com, andrew+netdev@lunn.ch,
davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
horms@kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
>On Tue, 26 May 2026 16:11:11 +0800 javen wrote:
>> @@ -4820,7 +4838,7 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
>> goto release_descriptor;
>> }
>>
>> - skb = napi_alloc_skb(&tp->napi, pkt_size);
>> + skb = napi_alloc_skb(&tp->rtl8169_napi[0], pkt_size);
>
>the caller is the NAPI poll function, you should pass that NAPI as arg to rtl_rx()
>already instead of hardcoding [0] in this patch.
>
>> if (unlikely(!skb)) {
>> dev->stats.rx_dropped++;
>> goto release_descriptor;
>> @@ -4844,7 +4862,7 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
>> if (skb->pkt_type == PACKET_MULTICAST)
>> dev->stats.multicast++;
>>
>> - napi_gro_receive(&tp->napi, skb);
>> + napi_gro_receive(&tp->rtl8169_napi[0], skb);
>>
>> dev_sw_netstats_rx_add(dev, pkt_size);
>> release_descriptor:
>
>> +static int rtl8169_set_real_num_queues(struct rtl8169_private *tp)
>> +{
>> + int ret;
>> +
>> + ret = netif_set_real_num_tx_queues(tp->dev, 1);
>> + if (ret < 0)
>> + return ret;
>> +
>> + return netif_set_real_num_rx_queues(tp->dev, tp->num_rx_rings);
>
>netif_set_real_num_queues() exists, just call it directly instead of adding your
>own helper.
>
>> +}
>> +
>> static int rtl_jumbo_max(struct rtl8169_private *tp)
>> {
>> /* Non-GBit versions don't support jumbo frames */
>> @@ -5599,6 +5669,22 @@ static bool rtl_aspm_is_safe(struct rtl8169_private *tp)
>> return false;
>> }
>>
>> +static void r8169_del_napi_action(void *data)
>> +{
>> + struct rtl8169_private *tp = data;
>> + int i;
>> +
>> + for (i = 0; i < tp->irq_nvecs; i++)
>> + netif_napi_del(&tp->rtl8169_napi[i]);
>> +}
>> +
>> +static void r8169_init_napi(struct rtl8169_private *tp)
>> +{
>> + for (int i = 0; i < tp->irq_nvecs; i++)
>> + netif_napi_add(tp->dev, &tp->rtl8169_napi[i], rtl8169_poll);
>> + devm_add_action_or_reset(&tp->pci_dev->dev, r8169_del_napi_action, tp);
>
>devm_add_action_or_reset() can fail (as the AI bots point out) but this whole
>devm_ dance is entirely unnecessary networking stack will automatically
>delete NAPI instances when device is unregistered.
Thanks for your review.
In patch v3, link: https://lore.kernel.org/netdev/20260513115543.1730-2-javen_xu@realsil.com.cn/
I tried to allocate struct rtl8169_napi dynamically to save memory, following Heiner's suggestion. I agree with his suggestion because RSS is enabled only on the 8127.
And in this AI review, link: https://netdev-ai.bots.linux.dev/sashiko/#/patchset/20260520031603.700-1-javen_xu%40realsil.com.cn
the AI suggested that the lifetime of this devm_kcalloc'd napi array may be incompatible with the netdev's napi list. So I added devm_add_action_or_reset() in patch v6.
I checked the code and agree that the stack auto-deletes NAPI instances in free_netdev() -> netdev_napi_exit(). However, because devres releases resources in LIFO order:
1. kfree for the NAPI array (allocated via devm_kcalloc) will be called first.
2. free_netdev() (registered via devm_alloc_etherdev) will be called second
When free_netdev() calls netdev_napi_exit() to iterate over dev->napi_list, the NAPI memory has already been freed by devm, which will cause a Use-After-Free. That's why I added the devm action to explicitly remove it before the memory is freed.
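
The ordering argument above can be checked with a small user-space model. This
is only a sketch under stated assumptions: the toy_* names are hypothetical,
the list is a simplified stand-in for devres (release in reverse registration
order), and "use-after-free" is tracked with flags rather than by touching
freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a devres list; not the kernel implementation. */
typedef void (*toy_release_fn)(void *ctx);

struct toy_devres {
	toy_release_fn fn[8];
	void *ctx[8];
	size_t count;
};

struct toy_state {
	bool napi_mem_valid;	/* false once the NAPI array has been "freed" */
	bool napi_on_list;	/* true while NAPI is linked on dev->napi_list */
	bool use_after_free;	/* set if teardown walks freed NAPI memory */
};

static void toy_devm_add(struct toy_devres *dr, toy_release_fn fn, void *ctx)
{
	assert(dr->count < 8);
	dr->fn[dr->count] = fn;
	dr->ctx[dr->count] = ctx;
	dr->count++;
}

/* Release in LIFO order: last registered, first released. */
static void toy_devres_release_all(struct toy_devres *dr)
{
	while (dr->count > 0) {
		dr->count--;
		dr->fn[dr->count](dr->ctx[dr->count]);
	}
}

/* Stands in for free_netdev() -> netdev_napi_exit() walking the NAPI list. */
static void toy_free_netdev(void *ctx)
{
	struct toy_state *s = ctx;

	if (s->napi_on_list && !s->napi_mem_valid)
		s->use_after_free = true;
	s->napi_on_list = false;
}

/* Stands in for the kfree() of the devm_kcalloc'd NAPI array. */
static void toy_free_napi_array(void *ctx)
{
	struct toy_state *s = ctx;

	s->napi_mem_valid = false;
}

/* Stands in for the devm action calling netif_napi_del() on each NAPI. */
static void toy_del_napi_action(void *ctx)
{
	struct toy_state *s = ctx;

	s->napi_on_list = false;
}

/* Returns true if teardown would touch freed NAPI memory. */
bool toy_teardown_hits_uaf(bool register_del_action)
{
	struct toy_devres dr = { .count = 0 };
	struct toy_state s = { .napi_mem_valid = true, .napi_on_list = true };

	toy_devm_add(&dr, toy_free_netdev, &s);		/* netdev allocated first */
	toy_devm_add(&dr, toy_free_napi_array, &s);	/* NAPI array second */
	if (register_del_action)
		toy_devm_add(&dr, toy_del_napi_action, &s);

	toy_devres_release_all(&dr);
	return s.use_after_free;
}
```

In this model the teardown without the extra action frees the array before the
netdev release walks the list, while registering the action last makes it run
first and unlink the NAPI instances in time.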
So what should I do? Should I keep the devm action in this patch (dynamically allocated napi array), go back to the patch v2 approach (fixed array size), or do something else? Any suggestion will be appreciated.
BRs,
Javen
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: [Patch net-next v6 2/7] r8169: add support for multi rx queues
2026-05-29 1:04 ` Jakub Kicinski
@ 2026-05-29 6:47 ` Javen
2026-05-29 18:07 ` Jakub Kicinski
0 siblings, 1 reply; 18+ messages in thread
From: Javen @ 2026-05-29 6:47 UTC (permalink / raw)
To: Jakub Kicinski
Cc: hkallweit1@gmail.com, nic_swsd@realtek.com, andrew+netdev@lunn.ch,
davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
horms@kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
>On Tue, 26 May 2026 16:11:12 +0800 javen wrote:
>> diff --git a/drivers/net/ethernet/realtek/r8169_main.c
>> b/drivers/net/ethernet/realtek/r8169_main.c
>> index 22e843baffc7..62bf77aa1ec8 100644
>> --- a/drivers/net/ethernet/realtek/r8169_main.c
>> +++ b/drivers/net/ethernet/realtek/r8169_main.c
>> @@ -74,9 +74,13 @@
>> #define NUM_TX_DESC 256 /* Number of Tx descriptor registers */
>> #define NUM_RX_DESC 256 /* Number of Rx descriptor registers */
>> #define R8169_TX_RING_BYTES (NUM_TX_DESC * sizeof(struct TxDesc))
>> -#define R8169_RX_RING_BYTES (NUM_RX_DESC * sizeof(struct RxDesc))
>> +#define R8169_RX_RING_BYTES ((NUM_RX_DESC + 1) * sizeof(struct RxDesc))
>
>AI bots are asking why the "+ 1"?
This + 1 is a workaround for the hardware DMA prefetcher. The H/W might aggressively fetch one more descriptor even after hitting the RingEnd mark. We allocated this extra dummy space as padding to prevent out-of-bounds access and potential IOMMU faults.
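
To make the padding concrete, here is a small user-space sketch of the size
arithmetic. It assumes a 16-byte descriptor laid out as two 32-bit option
words plus a 64-bit address; desc_fetch_in_bounds() is a hypothetical helper
for illustration only, and the comment on the macro is just one possible
wording.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_RX_DESC 256

/* Assumed descriptor layout, for illustration only. */
struct RxDesc {
	uint32_t opts1;
	uint32_t opts2;
	uint64_t addr;
};

/*
 * One extra descriptor of padding: the hardware may prefetch one
 * descriptor past the RingEnd entry, and that read must still land
 * inside the DMA allocation.
 */
#define RX_RING_BYTES ((NUM_RX_DESC + 1) * sizeof(struct RxDesc))

/* True if fetching the descriptor at index idx stays inside ring_bytes. */
bool desc_fetch_in_bounds(size_t ring_bytes, size_t idx)
{
	return (idx + 1) * sizeof(struct RxDesc) <= ring_bytes;
}
```

A fetch at index NUM_RX_DESC (one past the last real descriptor) is in bounds
with the padded size and out of bounds with the unpadded one.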
>
>> #define R8169_TX_STOP_THRS (MAX_SKB_FRAGS + 1)
>> #define R8169_TX_START_THRS (2 * R8169_TX_STOP_THRS)
>> +#define R8169_MAX_RX_QUEUES 8
>> +#define R8127_MAX_RX_QUEUES 8
>> +#define R8169_DEFAULT_RX_QUEUES 1
>> +#define R8169_MAX_TX_QUEUES 1
>>
>> #define OCP_STD_PHY_BASE 0xa400
>>
>> @@ -441,6 +445,7 @@ enum rtl8125_registers {
>> TxPoll_8125 = 0x90,
>> LEDSEL3 = 0x96,
>> MAC0_BKP = 0x19e0,
>> + RDSAR_Q1_LOW = 0x4000,
>> RSS_CTRL_8125 = 0x4500,
>> Q_NUM_CTRL_8125 = 0x4800,
>> EEE_TXIDLE_TIMER_8125 = 0x6048,
>> @@ -728,6 +733,21 @@ enum rtl_dash_type {
>> RTL_DASH_25_BP,
>> };
>>
>> +enum rx_desc_ring_type {
>> + RX_DESC_RING_TYPE_DEFAULT,
>> + RX_DESC_RING_TYPE_RSS,
>> +};
>> +
>> +struct rtl8169_rx_ring {
>> + u32 index; /* Rx queue index */
>> + u32 cur_rx; /* Index of next Rx pkt. */
>> + u32 dirty_rx; /* Index for recycling. */
>> + struct RxDesc *rx_desc_array; /* array of Rx Desc*/
>> + dma_addr_t rx_desc_phy_addr[NUM_RX_DESC]; /* Rx data buffer physical dma address */
>> + dma_addr_t rx_phy_addr; /* Rx desc physical address */
>> + struct page *rx_databuff[NUM_RX_DESC]; /* Rx data buffers */
>> +};
>> +
>> struct rtl8169_private {
>> void __iomem *mmio_addr; /* memory map physical address */
>> struct pci_dev *pci_dev;
>> @@ -735,20 +755,18 @@ struct rtl8169_private {
>> struct phy_device *phydev;
>> enum mac_version mac_version;
>> enum rtl_dash_type dash_type;
>> - u32 cur_rx; /* Index into the Rx descriptor buffer of next Rx pkt. */
>> u32 cur_tx; /* Index into the Tx descriptor buffer of next Rx pkt. */
>> u32 dirty_tx;
>> struct TxDesc *TxDescArray; /* 256-aligned Tx descriptor ring */
>> - struct RxDesc *RxDescArray; /* 256-aligned Rx descriptor ring */
>> dma_addr_t TxPhyAddr;
>> - dma_addr_t RxPhyAddr;
>> - struct page *Rx_databuff[NUM_RX_DESC]; /* Rx data buffers */
>> struct ring_info tx_skb[NUM_TX_DESC]; /* Tx data buffers */
>> struct napi_struct *rtl8169_napi;
>> + struct rtl8169_rx_ring rx_ring[R8169_MAX_RX_QUEUES];
>> unsigned int num_rx_rings;
>> u16 cp_cmd;
>> u16 tx_lpi_timer;
>> u32 irq_mask;
>> + unsigned int hw_supp_num_rx_queues;
>> unsigned int irq_nvecs;
>> struct clk *clk;
>>
>> @@ -764,6 +782,7 @@ struct rtl8169_private {
>> unsigned aspm_manageable:1;
>> unsigned dash_enabled:1;
>> bool sfp_mode:1;
>> + bool recheck_desc_ownbit:1;
>
>AI bots ask if this needs to be set for all chips or just some specific version.
>Also, I think this workaround should be added in a dedicated commit. For
>ease of review the introduction of struct rtl8169_rx_ring should be a code-
>reshuffling type of commit, rather than a functional change.
I will remove it from this patch and add it in a dedicated commit.
>
>> dma_addr_t counters_phys_addr;
>> struct rtl8169_counters *counters;
>> struct rtl8169_tc_offsets tc_offset;
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Patch net-next v6 1/7] r8169: add support for multi irqs
2026-05-29 5:43 ` Javen
@ 2026-05-29 18:07 ` Jakub Kicinski
0 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2026-05-29 18:07 UTC (permalink / raw)
To: Javen
Cc: hkallweit1@gmail.com, nic_swsd@realtek.com, andrew+netdev@lunn.ch,
davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
horms@kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
On Fri, 29 May 2026 05:43:52 +0000 Javen wrote:
> >devm_add_action_or_reset() can fail (as the AI bots point out) but this whole
> >devm_ dance is entirely unnecessary networking stack will automatically
> >delete NAPI instances when device is unregistered.
>
> Thanks for your review.
>
> In patch v3, link: https://lore.kernel.org/netdev/20260513115543.1730-2-javen_xu@realsil.com.cn/
> I tried to allocate struct rtl8169_napi dynamically to save memory, following Heiner's suggestion. I agree with his suggestion because RSS is enabled only on the 8127.
> And in this AI review, link: https://netdev-ai.bots.linux.dev/sashiko/#/patchset/20260520031603.700-1-javen_xu%40realsil.com.cn
> the AI suggested that the lifetime of this devm_kcalloc'd napi array may be incompatible with the netdev's napi list. So I added devm_add_action_or_reset() in patch v6.
> I checked the code and agree that the stack auto-deletes NAPI instances in free_netdev() -> netdev_napi_exit(). However, because devres releases resources in LIFO order:
> 1. kfree for the NAPI array (allocated via devm_kcalloc) will be called first.
> 2. free_netdev() (registered via devm_alloc_etherdev) will be called second
> When free_netdev() calls netdev_napi_exit() to iterate over dev->napi_list, the NAPI memory has already been freed by devm, which will cause a Use-After-Free. That's why I added the devm action to explicitly remove it before the memory is freed.
>
> So what should I do? Should I keep the devm action in this patch
> (dynamically allocated napi array), go back to the patch v2 approach
> (fixed array size), or do something else? Any suggestion will be
> appreciated.
Personal preference I guess. IMO it's pretty clear here that the devm_
help relatively little and they introduce a lot of complexity. You can
stop using them, or just handle the error on devm_add_action_or_reset().
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Patch net-next v6 2/7] r8169: add support for multi rx queues
2026-05-29 6:47 ` Javen
@ 2026-05-29 18:07 ` Jakub Kicinski
0 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2026-05-29 18:07 UTC (permalink / raw)
To: Javen
Cc: hkallweit1@gmail.com, nic_swsd@realtek.com, andrew+netdev@lunn.ch,
davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
horms@kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
On Fri, 29 May 2026 06:47:00 +0000 Javen wrote:
> >> @@ -74,9 +74,13 @@
> >> #define NUM_TX_DESC 256 /* Number of Tx descriptor registers */
> >> #define NUM_RX_DESC 256 /* Number of Rx descriptor registers */
> >> #define R8169_TX_RING_BYTES (NUM_TX_DESC * sizeof(struct TxDesc))
> >> -#define R8169_RX_RING_BYTES (NUM_RX_DESC * sizeof(struct RxDesc))
> >> +#define R8169_RX_RING_BYTES ((NUM_RX_DESC + 1) * sizeof(struct RxDesc))
> >
> >AI bots are asking why the "+ 1"?
>
> This + 1 is a workaround for the hardware DMA prefetcher. The H/W might aggressively fetch one more descriptor even after hitting the RingEnd mark. We allocated this extra dummy space as padding to prevent out-of-bounds access and potential IOMMU faults.
Add a brief comment explaining this please
^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: [Patch net-next v6 0/7] r8169: add RSS support for RTL8127
2026-05-26 8:11 [Patch net-next v6 0/7] r8169: add RSS support for RTL8127 javen
` (6 preceding siblings ...)
2026-05-26 8:11 ` [Patch net-next v6 7/7] r8169: support setting rx queue numbers via ethtool javen
@ 2026-06-01 2:14 ` Javen
7 siblings, 0 replies; 18+ messages in thread
From: Javen @ 2026-06-01 2:14 UTC (permalink / raw)
To: Javen, hkallweit1@gmail.com, nic_swsd@realtek.com,
andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com, horms@kernel.org
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org
>This patch series adds RSS (Receive Side Scaling) support for the r8169
>ethernet driver, specifically for RTL8127 (RTL_GIGA_MAC_VER_80).
>
>RSS enables packet distribution across multiple receive queues, which can
>significantly improve network throughput on multi-core systems by allowing
>parallel processing of incoming packets.
>
>Key features:
>- Multi-queue RX support (up to 8 queues)
>- MSI-X interrupt with vector mapping
>- Dynamic queue configuration via ethtool (-L)
>- RSS hash computation for flow classification
>
Hi, Heiner
A gentle ping on this v6 patch.
I have received some feedback from other reviewers (and the AI bots) during this week, and I have already addressed those comments locally.
Before I send out the v7 patchset, I wanted to kindly check if you have any additional comments on this v6? Or would you prefer me to send out v7 directly with the current fixes so you can review the latest version?
Thanks for your time and help!
Best regards,
Javen
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [Patch net-next v6 1/7] r8169: add support for multi irqs
2026-05-26 8:11 ` [Patch net-next v6 1/7] r8169: add support for multi irqs javen
2026-05-29 1:00 ` Jakub Kicinski
@ 2026-06-02 21:22 ` Heiner Kallweit
1 sibling, 0 replies; 18+ messages in thread
From: Heiner Kallweit @ 2026-06-02 21:22 UTC (permalink / raw)
To: javen, nic_swsd, andrew+netdev, davem, edumazet, kuba, pabeni,
horms
Cc: netdev, linux-kernel
On 26.05.2026 10:11, javen wrote:
> From: Javen Xu <javen_xu@realsil.com.cn>
>
> RSS uses multiple rx queues to receive packets, and each rx queue needs
> its own irq and napi. So this patch adds support for multiple irqs and
> napi instances.
> Signed-off-by: Javen Xu <javen_xu@realsil.com.cn>
> ---
> Changes in v2:
> - remove some unused definitions, such as index, name in rtl8169_irq
> - remove array imr and isr
> - remove min_irq_nvecs and max_irq_nvecs, replaced with help function
> get_min_irq_nvecs and get_max_irq_nvecs
> - alloc irq by flags, instead of PCI_IRQ_ALL_TYPES
>
> Changes in v3:
> - add enum rtl_isr_version to replace macro definition
> - remove struct rtl8169_napi, use napi_struct array instead and alloc
> memory for this array dynamically
> - remove struct rtl8169_irq
>
> Changes in v4:
> - change retval to ret in rtl8169_set_real_num_queue()
> - reverse xmas tree in rtl8169_poll() and rtl8169_interrupt()
> - remove tp->hw_supp_isr_ver
>
> Changes in v5:
> - rtl8169_request_irq(), when failed, only free irqs which are
> allocated
> - remove rss_support, simplified napi init, call r8169_init_napi()
> directly
> - remove rtl_isr_version, INTR_VEC_MAP_MASK, INTR_VEC_MAP_STATUS,
> R8169_MAX_MSIX_VEC, rss_enable, recheck_desc_ownbit
> - rtl_software_parameter_initialize() will be expanded in the
> next patch, so I want to keep it here.
>
> Changes in v6:
> - Fix netpoll crash
> - Fix use-after-free during driver unload by registering a devm action
> for netif_napi_del()
> - remove tp->irq
> ---
> drivers/net/ethernet/realtek/r8169_main.c | 144 ++++++++++++++++++----
> 1 file changed, 120 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
> index ec4fc21fa21f..22e843baffc7 100644
> --- a/drivers/net/ethernet/realtek/r8169_main.c
> +++ b/drivers/net/ethernet/realtek/r8169_main.c
> @@ -733,7 +733,6 @@ struct rtl8169_private {
> struct pci_dev *pci_dev;
> struct net_device *dev;
> struct phy_device *phydev;
> - struct napi_struct napi;
> enum mac_version mac_version;
> enum rtl_dash_type dash_type;
> u32 cur_rx; /* Index into the Rx descriptor buffer of next Rx pkt. */
> @@ -745,10 +744,12 @@ struct rtl8169_private {
> dma_addr_t RxPhyAddr;
> struct page *Rx_databuff[NUM_RX_DESC]; /* Rx data buffers */
> struct ring_info tx_skb[NUM_TX_DESC]; /* Tx data buffers */
> + struct napi_struct *rtl8169_napi;
> + unsigned int num_rx_rings;
> u16 cp_cmd;
> u16 tx_lpi_timer;
> u32 irq_mask;
> - int irq;
> + unsigned int irq_nvecs;
> struct clk *clk;
>
> struct {
> @@ -2680,6 +2681,11 @@ static void rtl_hw_reset(struct rtl8169_private *tp)
> rtl_loop_wait_low(tp, &rtl_chipcmd_cond, 100, 100);
> }
>
> +static void rtl_software_parameter_initialize(struct rtl8169_private *tp)
"software parameter" is too generic IMO. Can the function name better describe
which kind of parameters is initialized here?
> +{
> + tp->num_rx_rings = 1;
> +}
> +
> static void rtl_request_firmware(struct rtl8169_private *tp)
> {
> struct rtl_fw *rtl_fw;
> @@ -4266,9 +4272,21 @@ static void rtl8169_tx_clear(struct rtl8169_private *tp)
> netdev_reset_queue(tp->dev);
> }
>
> +static void rtl8169_napi_disable(struct rtl8169_private *tp)
> +{
> + for (int i = 0; i < tp->irq_nvecs; i++)
> + napi_disable(&tp->rtl8169_napi[i]);
> +}
> +
> +static void rtl8169_napi_enable(struct rtl8169_private *tp)
> +{
> + for (int i = 0; i < tp->irq_nvecs; i++)
> + napi_enable(&tp->rtl8169_napi[i]);
> +}
> +
> static void rtl8169_cleanup(struct rtl8169_private *tp)
> {
> - napi_disable(&tp->napi);
> + rtl8169_napi_disable(tp);
>
> /* Give a racing hard_start_xmit a few cycles to complete. */
> synchronize_net();
> @@ -4314,7 +4332,7 @@ static void rtl_reset_work(struct rtl8169_private *tp)
> for (i = 0; i < NUM_RX_DESC; i++)
> rtl8169_mark_to_asic(tp->RxDescArray + i);
>
> - napi_enable(&tp->napi);
> + rtl8169_napi_enable(tp);
> rtl_hw_start(tp);
> }
>
> @@ -4820,7 +4838,7 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
> goto release_descriptor;
> }
>
> - skb = napi_alloc_skb(&tp->napi, pkt_size);
> + skb = napi_alloc_skb(&tp->rtl8169_napi[0], pkt_size);
> if (unlikely(!skb)) {
> dev->stats.rx_dropped++;
> goto release_descriptor;
> @@ -4844,7 +4862,7 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
> if (skb->pkt_type == PACKET_MULTICAST)
> dev->stats.multicast++;
>
> - napi_gro_receive(&tp->napi, skb);
> + napi_gro_receive(&tp->rtl8169_napi[0], skb);
>
> dev_sw_netstats_rx_add(dev, pkt_size);
> release_descriptor:
> @@ -4856,8 +4874,12 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp, int budget
>
> static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
> {
> - struct rtl8169_private *tp = dev_instance;
> - u32 status = rtl_get_events(tp);
> + struct napi_struct *napi = dev_instance;
> + struct rtl8169_private *tp;
> + u32 status;
> +
> + tp = netdev_priv(napi->dev);
> + status = rtl_get_events(tp);
>
> if ((status & 0xffff) == 0xffff || !(status & tp->irq_mask))
> return IRQ_NONE;
> @@ -4873,13 +4895,43 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
> phy_mac_interrupt(tp->phydev);
>
> rtl_irq_disable(tp);
> - napi_schedule(&tp->napi);
> + napi_schedule(napi);
> out:
> rtl_ack_events(tp, status);
>
> return IRQ_HANDLED;
> }
>
> +static void rtl8169_free_irq(struct rtl8169_private *tp)
> +{
> + for (int i = 0; i < tp->irq_nvecs; i++) {
> + struct napi_struct *napi = &tp->rtl8169_napi[i];
> +
> + pci_free_irq(tp->pci_dev, i, napi);
> + }
> +}
> +
> +static int rtl8169_request_irq(struct rtl8169_private *tp)
> +{
> + struct net_device *dev = tp->dev;
> + struct napi_struct *napi;
> + int i, rc;
> +
> + for (i = 0; i < tp->irq_nvecs; i++) {
> + napi = &tp->rtl8169_napi[i];
> + rc = pci_request_irq(tp->pci_dev, i, rtl8169_interrupt,
> + NULL, napi, "%s-%d", dev->name, i);
> + if (rc)
> + goto free_irq;
> + }
> + return 0;
> +
> +free_irq:
> + while (--i >= 0)
> + pci_free_irq(tp->pci_dev, i, &tp->rtl8169_napi[i]);
> + return rc;
> +}
> +
> static void rtl_task(struct work_struct *work)
> {
> struct rtl8169_private *tp =
> @@ -4914,9 +4966,9 @@ static void rtl_task(struct work_struct *work)
>
> static int rtl8169_poll(struct napi_struct *napi, int budget)
> {
> - struct rtl8169_private *tp = container_of(napi, struct rtl8169_private, napi);
> - struct net_device *dev = tp->dev;
> - int work_done;
> + struct rtl8169_private *tp = netdev_priv(napi->dev);
> + struct net_device *dev = napi->dev;
> + int work_done = 0;
>
> rtl_tx(dev, tp, budget);
>
> @@ -5035,7 +5087,7 @@ static void rtl8169_up(struct rtl8169_private *tp)
> phy_init_hw(tp->phydev);
> phy_resume(tp->phydev);
> rtl8169_init_phy(tp);
> - napi_enable(&tp->napi);
> + rtl8169_napi_enable(tp);
> enable_work(&tp->wk.work);
> rtl_reset_work(tp);
>
> @@ -5053,7 +5105,7 @@ static int rtl8169_close(struct net_device *dev)
> rtl8169_down(tp);
> rtl8169_rx_clear(tp);
>
> - free_irq(tp->irq, tp);
> + rtl8169_free_irq(tp);
>
> phy_disconnect(tp->phydev);
>
> @@ -5074,7 +5126,7 @@ static void rtl8169_netpoll(struct net_device *dev)
> {
> struct rtl8169_private *tp = netdev_priv(dev);
>
> - rtl8169_interrupt(tp->irq, tp);
> + rtl8169_interrupt(pci_irq_vector(tp->pci_dev, 0), &tp->rtl8169_napi[0]);
> }
> #endif
>
> @@ -5082,7 +5134,6 @@ static int rtl_open(struct net_device *dev)
> {
> struct rtl8169_private *tp = netdev_priv(dev);
> struct pci_dev *pdev = tp->pci_dev;
> - unsigned long irqflags;
> int retval = -ENOMEM;
>
> pm_runtime_get_sync(&pdev->dev);
> @@ -5107,8 +5158,7 @@ static int rtl_open(struct net_device *dev)
>
> rtl_request_firmware(tp);
>
> - irqflags = pci_dev_msi_enabled(pdev) ? IRQF_NO_THREAD : IRQF_SHARED;
> - retval = request_irq(tp->irq, rtl8169_interrupt, irqflags, dev->name, tp);
> + retval = rtl8169_request_irq(tp);
> if (retval < 0)
> goto err_release_fw_2;
>
> @@ -5125,7 +5175,7 @@ static int rtl_open(struct net_device *dev)
> return retval;
>
> err_free_irq:
> - free_irq(tp->irq, tp);
> + rtl8169_free_irq(tp);
> err_release_fw_2:
> rtl_release_firmware(tp);
> rtl8169_rx_clear(tp);
> @@ -5328,7 +5378,9 @@ static void rtl_set_irq_mask(struct rtl8169_private *tp)
>
> static int rtl_alloc_irq(struct rtl8169_private *tp)
> {
> + struct pci_dev *pdev = tp->pci_dev;
> unsigned int flags;
> + int nvecs;
>
> switch (tp->mac_version) {
> case RTL_GIGA_MAC_VER_02 ... RTL_GIGA_MAC_VER_06:
> @@ -5344,7 +5396,14 @@ static int rtl_alloc_irq(struct rtl8169_private *tp)
> break;
> }
>
> - return pci_alloc_irq_vectors(tp->pci_dev, 1, 1, flags);
> + nvecs = pci_alloc_irq_vectors(pdev, 1, 1, flags);
> +
> + if (nvecs < 0)
> + return nvecs;
> +
> + tp->irq_nvecs = nvecs;
> +
> + return 0;
> }
>
> static void rtl_read_mac_address(struct rtl8169_private *tp,
> @@ -5539,6 +5598,17 @@ static void rtl_hw_initialize(struct rtl8169_private *tp)
> }
> }
>
> +static int rtl8169_set_real_num_queues(struct rtl8169_private *tp)
> +{
> + int ret;
> +
> + ret = netif_set_real_num_tx_queues(tp->dev, 1);
> + if (ret < 0)
> + return ret;
> +
> + return netif_set_real_num_rx_queues(tp->dev, tp->num_rx_rings);
> +}
> +
> static int rtl_jumbo_max(struct rtl8169_private *tp)
> {
> /* Non-GBit versions don't support jumbo frames */
> @@ -5599,6 +5669,22 @@ static bool rtl_aspm_is_safe(struct rtl8169_private *tp)
> return false;
> }
>
> +static void r8169_del_napi_action(void *data)
> +{
> + struct rtl8169_private *tp = data;
> + int i;
> +
> + for (i = 0; i < tp->irq_nvecs; i++)
> + netif_napi_del(&tp->rtl8169_napi[i]);
> +}
> +
> +static void r8169_init_napi(struct rtl8169_private *tp)
> +{
> + for (int i = 0; i < tp->irq_nvecs; i++)
> + netif_napi_add(tp->dev, &tp->rtl8169_napi[i], rtl8169_poll);
> + devm_add_action_or_reset(&tp->pci_dev->dev, r8169_del_napi_action, tp);
> +}
> +
> static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
> {
> const struct rtl_chip_info *chip;
> @@ -5703,11 +5789,16 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>
> rtl_hw_reset(tp);
>
> + rtl_software_parameter_initialize(tp);
> +
> rc = rtl_alloc_irq(tp);
> if (rc < 0)
> return dev_err_probe(&pdev->dev, rc, "Can't allocate interrupt\n");
>
> - tp->irq = pci_irq_vector(pdev, 0);
> + tp->rtl8169_napi = devm_kcalloc(&pdev->dev, tp->irq_nvecs,
> + sizeof(struct napi_struct), GFP_KERNEL);
> + if (!tp->rtl8169_napi)
> + return -ENOMEM;
>
> INIT_WORK(&tp->wk.work, rtl_task);
> disable_work(&tp->wk.work);
> @@ -5716,7 +5807,7 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
>
> dev->ethtool_ops = &rtl8169_ethtool_ops;
>
> - netif_napi_add(dev, &tp->napi, rtl8169_poll);
> + r8169_init_napi(tp);
>
> dev->hw_features = NETIF_F_IP_CSUM | NETIF_F_RXCSUM |
> NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX;
> @@ -5778,6 +5869,10 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
> if (jumbo_max)
> dev->max_mtu = jumbo_max;
>
> + rc = rtl8169_set_real_num_queues(tp);
> + if (rc < 0)
> + return dev_err_probe(&pdev->dev, rc, "set tx/rx num failure\n");
> +
> rtl_set_irq_mask(tp);
>
> tp->counters = dmam_alloc_coherent (&pdev->dev, sizeof(*tp->counters),
> @@ -5803,8 +5898,9 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
> tp->leds = rtl8168_init_leds(dev);
> }
>
> - netdev_info(dev, "%s, %pM, %sXID %x, IRQ %d\n",
> - chip->name, dev->dev_addr, ext_xid_str, xid, tp->irq);
> + netdev_info(dev, "%s, %pM, %sXID %x, IRQ %d (%d total)\n",
> + chip->name, dev->dev_addr, ext_xid_str, xid,
> + pci_irq_vector(pdev, 0), tp->irq_nvecs);
>
> if (jumbo_max)
> netdev_info(dev, "jumbo features [frames: %d bytes, tx checksumming: %s]\n",
* Re: [Patch net-next v6 3/7] r8169: add support for new interrupt mapping
2026-05-26 8:11 ` [Patch net-next v6 3/7] r8169: add support for new interrupt mapping javen
@ 2026-06-02 21:23 ` Heiner Kallweit
0 siblings, 0 replies; 18+ messages in thread
From: Heiner Kallweit @ 2026-06-02 21:23 UTC (permalink / raw)
To: javen, nic_swsd, andrew+netdev, davem, edumazet, kuba, pabeni,
horms
Cc: netdev, linux-kernel
On 26.05.2026 10:11, javen wrote:
> From: Javen Xu <javen_xu@realsil.com.cn>
>
> To support RSS, the number of hardware interrupt bits should match the
> number of software interrupt vectors, so add support for the new
> interrupt mapping here. ISR_VEC_MAP_REG is the hardware register that
> reports interrupt status. IMR_SET_VEC_MAP_REG is the interrupt mask
> register; setting a bit in it enables the corresponding irq.
>
> Signed-off-by: Javen Xu <javen_xu@realsil.com.cn>
> ---
> Changes in v2:
> - no changes
>
> Changes in v3:
> - init index in napi_struct and get message_id from index
> - move rtl8169_disable_hw_interrupt_msix directly before the call to
> napi_schedule()
> - change the condition in rtl8169_request_irq when RTL_VEC_MAP_ENABLE
> enabled, use rtl8169_interrupt_msix
>
> Changes in v4:
> - remove flag tp->feature, replace tp->features & RTL_VEC_MAP_ENABLE
> with tp->irq_nvecs > 1, they are equivalent.
> - follow reverse xmas tree, in rtl8169_interrupt_msix(),
> rtl8169_poll_msix_rx(), rtl8169_poll_msix_tx(),
> rtl8169_poll_msix_other()
> - use napi->index in rtl8169_poll_msix_other()
> - add a comment to describe RTL8127 MSI-X vector layout
> - simplify r8169_init_napi()
>
> Changes in v5:
> - replace magic number in rtl8169_poll_msix_tx()
>
> Changes in v6:
> - when irq_nvecs <= 1, use register IntrMask_8125, else use the vec map
> - fix irq sequence in rtl8169_interrupt_msix(), disable the interrupt
> before clearing its status bit
> - remove dead code in rtl8169_poll_msix_tx()
> ---
> drivers/net/ethernet/realtek/r8169_main.c | 166 +++++++++++++++++++---
> 1 file changed, 150 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
> index 62bf77aa1ec8..951d2046a81b 100644
> --- a/drivers/net/ethernet/realtek/r8169_main.c
> +++ b/drivers/net/ethernet/realtek/r8169_main.c
> @@ -79,6 +79,7 @@
> #define R8169_TX_START_THRS (2 * R8169_TX_STOP_THRS)
> #define R8169_MAX_RX_QUEUES 8
> #define R8127_MAX_RX_QUEUES 8
> +#define R8127_MAX_TX_QUEUES 8
> #define R8169_DEFAULT_RX_QUEUES 1
> #define R8169_MAX_TX_QUEUES 1
>
> @@ -449,8 +450,12 @@ enum rtl8125_registers {
> RSS_CTRL_8125 = 0x4500,
> Q_NUM_CTRL_8125 = 0x4800,
> EEE_TXIDLE_TIMER_8125 = 0x6048,
> + IMR_CLEAR_VEC_MAP_REG = 0x0d00,
> + ISR_VEC_MAP_REG = 0x0d04,
> + IMR_SET_VEC_MAP_REG = 0x0d0c,
> };
>
> +#define MSIX_ID_VEC_MAP_LINKCHG 29
> #define LEDSEL_MASK_8125 0x23f
>
> #define RX_VLAN_INNER_8125 BIT(22)
> @@ -581,6 +586,9 @@ enum rtl_register_content {
>
> /* magic enable v2 */
> MagicPacket_v2 = (1 << 16), /* Wake up when receives a Magic Packet */
> +#define ISRIMR_LINKCHG BIT(29)
> +#define ISRIMR_TOK_Q0 BIT(8)
> +#define ISRIMR_ROK_Q0 BIT(0)
> };
>
> enum rtl_desc_bit {
> @@ -1664,26 +1672,38 @@ static u32 rtl_get_events(struct rtl8169_private *tp)
>
> static void rtl_ack_events(struct rtl8169_private *tp, u32 bits)
> {
> - if (rtl_is_8125(tp))
> - RTL_W32(tp, IntrStatus_8125, bits);
> - else
> + if (rtl_is_8125(tp)) {
> + if (tp->irq_nvecs > 1)
> + RTL_W32(tp, ISR_VEC_MAP_REG, bits);
> + else
> + RTL_W32(tp, IntrStatus_8125, bits);
> + } else {
> RTL_W16(tp, IntrStatus, bits);
> + }
> }
>
> static void rtl_irq_disable(struct rtl8169_private *tp)
> {
> - if (rtl_is_8125(tp))
> - RTL_W32(tp, IntrMask_8125, 0);
> - else
> + if (rtl_is_8125(tp)) {
> + if (tp->irq_nvecs > 1)
> + RTL_W32(tp, IMR_CLEAR_VEC_MAP_REG, 0xffffffff);
> + else
> + RTL_W32(tp, IntrMask_8125, 0);
> + } else {
> RTL_W16(tp, IntrMask, 0);
> + }
> }
>
> static void rtl_irq_enable(struct rtl8169_private *tp)
> {
> - if (rtl_is_8125(tp))
> - RTL_W32(tp, IntrMask_8125, tp->irq_mask);
> - else
> + if (rtl_is_8125(tp)) {
> + if (tp->irq_nvecs > 1)
> + RTL_W32(tp, IMR_SET_VEC_MAP_REG, tp->irq_mask);
> + else
> + RTL_W32(tp, IntrMask_8125, tp->irq_mask);
> + } else {
> RTL_W16(tp, IntrMask, tp->irq_mask);
> + }
> }
>
> static void rtl8169_irq_mask_and_ack(struct rtl8169_private *tp)
> @@ -5062,6 +5082,45 @@ static void rtl8169_free_irq(struct rtl8169_private *tp)
> }
> }
>
> +static void rtl8169_disable_hw_interrupt_msix(struct rtl8169_private *tp, int message_id)
> +{
> + RTL_W32(tp, IMR_CLEAR_VEC_MAP_REG, BIT(message_id));
> +}
> +
> +static void rtl8169_clear_hw_isr(struct rtl8169_private *tp, int message_id)
> +{
> + RTL_W32(tp, ISR_VEC_MAP_REG, BIT(message_id));
> +}
> +
> +static void rtl8169_enable_hw_interrupt_msix(struct rtl8169_private *tp, int message_id)
> +{
> + RTL_W32(tp, IMR_SET_VEC_MAP_REG, BIT(message_id));
> +}
> +
> +static irqreturn_t rtl8169_interrupt_msix(int irq, void *dev_instance)
> +{
> + struct napi_struct *napi = dev_instance;
> + struct net_device *dev = napi->dev;
> + int message_id = napi->index;
> + struct rtl8169_private *tp;
> +
> + tp = netdev_priv(dev);
> +
> + if (message_id == MSIX_ID_VEC_MAP_LINKCHG) {
> + rtl8169_clear_hw_isr(tp, message_id);
> + phy_mac_interrupt(tp->phydev);
> + return IRQ_HANDLED;
> + }
> +
> + rtl8169_disable_hw_interrupt_msix(tp, message_id);
> + rtl8169_clear_hw_isr(tp, message_id);
> +
> + tp->recheck_desc_ownbit = true;
> + napi_schedule(napi);
> +
> + return IRQ_HANDLED;
> +}
> +
> static int rtl8169_request_irq(struct rtl8169_private *tp)
> {
> struct net_device *dev = tp->dev;
> @@ -5070,8 +5129,12 @@ static int rtl8169_request_irq(struct rtl8169_private *tp)
>
> for (i = 0; i < tp->irq_nvecs; i++) {
> napi = &tp->rtl8169_napi[i];
> - rc = pci_request_irq(tp->pci_dev, i, rtl8169_interrupt,
> - NULL, napi, "%s-%d", dev->name, i);
> + if (tp->irq_nvecs > 1)
> + rc = pci_request_irq(tp->pci_dev, i, rtl8169_interrupt_msix,
> + NULL, napi, "%s-%d", dev->name, i);
> + else
> + rc = pci_request_irq(tp->pci_dev, i, rtl8169_interrupt,
> + NULL, napi, "%s-%d", dev->name, i);
> if (rc)
> goto free_irq;
> }
> @@ -5517,10 +5580,16 @@ static const struct net_device_ops rtl_netdev_ops = {
>
> static void rtl_set_irq_mask(struct rtl8169_private *tp)
> {
> - tp->irq_mask = RxOK | RxErr | TxOK | TxErr | LinkChg;
> + if (tp->irq_nvecs > 1) {
> + tp->irq_mask = ISRIMR_LINKCHG | ISRIMR_TOK_Q0;
> + for (int i = 0; i < tp->num_rx_rings; i++)
> + tp->irq_mask |= ISRIMR_ROK_Q0 << i;
> + } else {
> + tp->irq_mask = RxOK | RxErr | TxOK | TxErr | LinkChg;
>
> - if (tp->mac_version <= RTL_GIGA_MAC_VER_06)
> - tp->irq_mask |= SYSErr | RxFIFOOver;
> + if (tp->mac_version <= RTL_GIGA_MAC_VER_06)
> + tp->irq_mask |= SYSErr | RxFIFOOver;
> + }
> }
>
> static int rtl_alloc_irq(struct rtl8169_private *tp)
> @@ -5825,10 +5894,75 @@ static void r8169_del_napi_action(void *data)
> netif_napi_del(&tp->rtl8169_napi[i]);
> }
>
> +static int rtl8169_poll_msix_rx(struct napi_struct *napi, int budget)
> +{
> + struct net_device *dev = napi->dev;
> + const int message_id = napi->index;
> + struct rtl8169_private *tp;
> + int work_done = 0;
> +
> + tp = netdev_priv(dev);
> +
> + if (message_id < tp->num_rx_rings)
> + work_done += rtl_rx(dev, tp, &tp->rx_ring[message_id], budget);
> +
> + if (work_done < budget && napi_complete_done(napi, work_done))
> + rtl8169_enable_hw_interrupt_msix(tp, message_id);
> +
> + return work_done;
> +}
> +
> +static int rtl8169_poll_msix_tx(struct napi_struct *napi, int budget)
> +{
> + struct net_device *dev = napi->dev;
> + const int message_id = napi->index;
> + struct rtl8169_private *tp;
> +
> + tp = netdev_priv(dev);
> +
> + rtl_tx(dev, tp, budget);
> +
> + if (napi_complete_done(napi, 0))
> + rtl8169_enable_hw_interrupt_msix(tp, message_id);
> +
> + return 0;
> +}
> +
> +static int rtl8169_poll_msix_other(struct napi_struct *napi, int budget)
> +{
> + struct net_device *dev = napi->dev;
> + const int message_id = napi->index;
This variable is used only once, I don't really see a benefit.
> + struct rtl8169_private *tp;
> +
> + tp = netdev_priv(dev);
> +
> + if (napi_complete_done(napi, 0))
> + rtl8169_enable_hw_interrupt_msix(tp, message_id);
> +
> + return 0;
> +}
> +
> +/* RTL8127 MSI-X vector layout:
> + * Vectors 0 .. (RxQs - 1) : Rx Queues
> + * Vectors RxQs .. (RxQs + TxQs - 1) : Tx Queues
> + * Vector (RxQs + TxQs) and up : Other events (Link status(29), etc.)
> + */
> static void r8169_init_napi(struct rtl8169_private *tp)
> {
> - for (int i = 0; i < tp->irq_nvecs; i++)
> - netif_napi_add(tp->dev, &tp->rtl8169_napi[i], rtl8169_poll);
> + for (int i = 0; i < tp->irq_nvecs; i++) {
> + int (*poll_fn)(struct napi_struct *, int) = rtl8169_poll;
> +
> + if (tp->irq_nvecs > 1) {
> + if (i < R8127_MAX_RX_QUEUES)
> + poll_fn = rtl8169_poll_msix_rx;
> + else if (i < R8127_MAX_RX_QUEUES + R8127_MAX_TX_QUEUES)
> + poll_fn = rtl8169_poll_msix_tx;
> + else
> + poll_fn = rtl8169_poll_msix_other;
> + }
> + netif_napi_add(tp->dev, &tp->rtl8169_napi[i], poll_fn);
> + tp->rtl8169_napi[i].index = i;
> + }
> devm_add_action_or_reset(&tp->pci_dev->dev, r8169_del_napi_action, tp);
> }
>
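For reference, the vector layout that r8169_init_napi() relies on can be
sketched as a small standalone classifier. This is only an illustration
under stated assumptions: the two queue limits are copied from the patch,
while enum vec_kind, classify_vector() and layout_selftest() are
hypothetical names that do not exist in the driver.

```c
#include <assert.h>

#define R8127_MAX_RX_QUEUES 8	/* value from the patch */
#define R8127_MAX_TX_QUEUES 8	/* value from the patch */
#define MSIX_ID_VEC_MAP_LINKCHG 29	/* value from the patch */

/* Hypothetical names, used only for this illustration. */
enum vec_kind { VEC_RX, VEC_TX, VEC_OTHER };

/* Map an MSI-X vector index to the kind of NAPI poll handler it gets. */
static enum vec_kind classify_vector(unsigned int vec)
{
	if (vec < R8127_MAX_RX_QUEUES)
		return VEC_RX;
	if (vec < R8127_MAX_RX_QUEUES + R8127_MAX_TX_QUEUES)
		return VEC_TX;
	return VEC_OTHER;
}

/* The link change vector must land in the "other" range. */
static void layout_selftest(void)
{
	assert(classify_vector(MSIX_ID_VEC_MAP_LINKCHG) == VEC_OTHER);
}
```

Note that the split is made on the maximum queue counts, not on
tp->num_rx_rings, which matches what the loop in the patch does.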
* Re: [Patch net-next v6 5/7] r8169: add support and enable rss
2026-05-26 8:11 ` [Patch net-next v6 5/7] r8169: add support and enable rss javen
@ 2026-06-02 21:23 ` Heiner Kallweit
0 siblings, 0 replies; 18+ messages in thread
From: Heiner Kallweit @ 2026-06-02 21:23 UTC (permalink / raw)
To: javen, nic_swsd, andrew+netdev, davem, edumazet, kuba, pabeni,
horms
Cc: netdev, linux-kernel
On 26.05.2026 10:11, javen wrote:
> From: Javen Xu <javen_xu@realsil.com.cn>
>
> This patch adds RSS support for RTL8127 and enables it.
>
> Signed-off-by: Javen Xu <javen_xu@realsil.com.cn>
> ---
> Changes in v2:
> - some changes moved from Patch 2/7
>
> Changes in v3:
> - add struct rtl8169_rss_data. Allocate it dynamically when needed.
> - define rss_key as an u32 array
> - replace some magic bit numbers in rtl8169_set_rss_hash_opt() and
> rtl8125_set_rx_q_num()
> - use union to combine different rx descriptor, refactor struct RxDesc
> - remove dead code from rtl8169_double_check_rss_support()
>
> Changes in v4:
> - rename macro definition, e.g R8127_MAX_IRQ to R8127_MAX_NUM_IRQVEC
> - change hw_supp_indir_tbl_entries type to unsigned int
> - change init_rx_desc_type type to enum
> - remove rtl_check_rss_support(), add helper function
> rtl_hw_support_rss()
> - remove hw_curr_isr_ver, use irq_nvecs to judge whether we should
> enable vector interrupt mapping, use tp->num_rx_ring to judge whether
> we should enable rss
> - remove function rtl8169_double_check_rss_support(), use
> rtl8169_set_rx_ring_num() to set num_rx_ring according to tp->irq_nvecs
>
> Changes in v5:
> - no changes
>
> Changes in v6:
> - change rss_queue_num type from u8 to unsigned int
> - fix rx desc clear in rtl8169_rx_clear() for different desc type
> - clamping num_rx_ring with rounddown_pow_of_two()
> ---
> drivers/net/ethernet/realtek/r8169_main.c | 397 +++++++++++++++++++---
> 1 file changed, 358 insertions(+), 39 deletions(-)
>
> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
> index 13d955324037..a79a8756516d 100644
> --- a/drivers/net/ethernet/realtek/r8169_main.c
> +++ b/drivers/net/ethernet/realtek/r8169_main.c
> @@ -82,6 +82,19 @@
> #define R8127_MAX_TX_QUEUES 8
> #define R8169_DEFAULT_RX_QUEUES 1
> #define R8169_MAX_TX_QUEUES 1
> +#define R8127_MAX_NUM_IRQVEC 32
> +#define R8127_MIN_NUM_IRQVEC 30
> +#define R8169_IRQ_DEFAULT 1
> +#define RTL_RSS_KEY_SIZE 40
> +#define RSS_CPU_NUM_MASK GENMASK(18, 16)
> +#define RSS_HASH_MASK GENMASK(10, 8)
> +#define RTL_MAX_INDIRECTION_TABLE_ENTRIES 128
> +#define RXS_RSS_UDP BIT(27)
> +#define RXS_RSS_IPV4 BIT(28)
> +#define RXS_RSS_IPV6 BIT(29)
> +#define RXS_RSS_TCP BIT(30)
> +#define RXS_RSS_L3_TYPE_MASK (RXS_RSS_IPV4 | RXS_RSS_IPV6)
> +#define RXS_RSS_L4_TYPE_MASK (RXS_RSS_TCP | RXS_RSS_UDP)
>
> #define OCP_STD_PHY_BASE 0xa400
>
> @@ -589,6 +602,25 @@ enum rtl_register_content {
> #define ISRIMR_LINKCHG BIT(29)
> #define ISRIMR_TOK_Q0 BIT(8)
> #define ISRIMR_ROK_Q0 BIT(0)
> +#define RTL_DESC_TYPE_CTRL 0xd8
> +#define RSS_KEY_REG 0x4600
> +#define RSS_INDIRECTION_TBL_REG 0x4700
> +#define RSS_CTRL_TCP_IPV4_SUPP BIT(0)
> +#define RTL_DESC_TYPE_RSS BIT(1)
> +#define RSS_CTRL_IPV4_SUPP BIT(1)
> +#define RSS_CTRL_TCP_IPV6_SUPP BIT(2)
> +#define RSS_CTRL_IPV6_SUPP BIT(3)
> +#define RSS_CTRL_IPV6_EXT_SUPP BIT(4)
> +#define RSS_CTRL_TCP_IPV6_EXT_SUPP BIT(5)
> +#define RSS_CTRL_UDP_IPV4_SUPP BIT(6)
> +#define RSS_CTRL_UDP_IPV6_SUPP BIT(7)
> +#define RSS_CTRL_UDP_IPV6_EXT_SUPP BIT(8)
> +#define RTL_RSS_FLAG_HASH_UDP_IPV4 BIT(0)
> +#define RTL_RSS_FLAG_HASH_UDP_IPV6 BIT(1)
> +#define RX_RES_RSS BIT(22)
> +#define RX_RUNT_RSS BIT(21)
> +#define RX_CRC_RSS BIT(20)
> +#define RTL_RX_Q_NUM_MASK GENMASK(4, 2)
> };
>
> enum rtl_desc_bit {
> @@ -646,6 +678,11 @@ enum rtl_rx_desc_bit {
> #define RxProtoIP (PID1 | PID0)
> #define RxProtoMask RxProtoIP
>
> +#define RX_UDPT_DESC_RSS BIT(19)
> +#define RX_TCPT_DESC_RSS BIT(18)
> +#define RX_UDPF_DESC_RSS BIT(16) /* UDP/IP checksum failed */
> +#define RX_TCPF_DESC_RSS BIT(15) /* TCP/IP checksum failed */
> +
> IPFail = (1 << 16), /* IP checksum failed */
> UDPFail = (1 << 15), /* UDP/IP checksum failed */
> TCPFail = (1 << 14), /* TCP/IP checksum failed */
> @@ -667,9 +704,27 @@ struct TxDesc {
> };
>
> struct RxDesc {
> - __le32 opts1;
> - __le32 opts2;
> - __le64 addr;
> + union {
> + /* RX_DESC_TYPE_DEFAULT */
> + struct {
> + __le32 opts1;
> + __le32 opts2;
> + __le64 addr;
> + };
> +
> + /* RX_DESC_TYPE_RSS */
> + struct {
> + union {
> + __le64 rss_addr;
> + struct {
> + __le32 rss_info;
> + __le32 rss_result;
> + } rss_dword;
> + };
> + __le32 rss_opts2;
> + __le32 rss_opts1;
> + };
> + };
> };
>
> struct ring_info {
> @@ -741,9 +796,9 @@ enum rtl_dash_type {
> RTL_DASH_25_BP,
> };
>
> -enum rx_desc_ring_type {
> - RX_DESC_RING_TYPE_DEFAULT,
> - RX_DESC_RING_TYPE_RSS,
> +enum rx_desc_type {
> + RX_DESC_TYPE_DEFAULT,
> + RX_DESC_TYPE_RSS,
> };
>
> struct rtl8169_rx_ring {
> @@ -756,6 +811,12 @@ struct rtl8169_rx_ring {
> struct page *rx_databuff[NUM_RX_DESC]; /* Rx data buffers */
> };
>
> +struct rtl8169_rss_data {
> + u32 rss_key[RTL_RSS_KEY_SIZE / sizeof(u32)];
> + u8 rss_indir_tbl[RTL_MAX_INDIRECTION_TABLE_ENTRIES];
> + unsigned int hw_supp_indir_tbl_entries;
> +};
> +
> struct rtl8169_private {
> void __iomem *mmio_addr; /* memory map physical address */
> struct pci_dev *pci_dev;
> @@ -775,7 +836,9 @@ struct rtl8169_private {
> u16 tx_lpi_timer;
> u32 irq_mask;
> unsigned int hw_supp_num_rx_queues;
> + struct rtl8169_rss_data *rss_data;
> unsigned int irq_nvecs;
> + enum rx_desc_type init_rx_desc_type;
> struct clk *clk;
>
> struct {
> @@ -1606,6 +1669,11 @@ static bool rtl_dash_is_enabled(struct rtl8169_private *tp)
> }
> }
>
> +static bool rtl_hw_support_rss(struct rtl8169_private *tp)
> +{
> + return tp->mac_version == RTL_GIGA_MAC_VER_80;
> +}
> +
> static enum rtl_dash_type rtl_get_dash_type(struct rtl8169_private *tp)
> {
> switch (tp->mac_version) {
> @@ -1907,9 +1975,20 @@ static inline u32 rtl8169_tx_vlan_tag(struct sk_buff *skb)
> TxVlanTag | swab16(skb_vlan_tag_get(skb)) : 0x00;
> }
>
> -static void rtl8169_rx_vlan_tag(struct RxDesc *desc, struct sk_buff *skb)
> +static void rtl8169_rx_vlan_tag(struct rtl8169_private *tp,
> + struct RxDesc *desc,
> + struct sk_buff *skb)
> {
> - u32 opts2 = le32_to_cpu(desc->opts2);
> + u32 opts2;
> +
> + switch (tp->init_rx_desc_type) {
> + case RX_DESC_TYPE_RSS:
> + opts2 = le32_to_cpu(desc->rss_opts2);
> + break;
> + default:
> + opts2 = le32_to_cpu(desc->opts2);
> + break;
> + }
>
> if (opts2 & RxVlanTag)
> __vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), swab16(opts2 & 0xffff));
> @@ -2738,17 +2817,27 @@ static void rtl_hw_reset(struct rtl8169_private *tp)
> rtl_loop_wait_low(tp, &rtl_chipcmd_cond, 100, 100);
> }
>
> +static void rtl8169_init_rss(struct rtl8169_private *tp)
> +{
> + for (int i = 0; i < tp->rss_data->hw_supp_indir_tbl_entries; i++)
> + tp->rss_data->rss_indir_tbl[i] = ethtool_rxfh_indir_default(i, tp->num_rx_rings);
> +
> + netdev_rss_key_fill(tp->rss_data->rss_key, RTL_RSS_KEY_SIZE);
> +}
> +
> static void rtl_software_parameter_initialize(struct rtl8169_private *tp)
> {
> tp->num_rx_rings = 1;
> switch (tp->mac_version) {
> case RTL_GIGA_MAC_VER_80:
> tp->hw_supp_num_rx_queues = R8127_MAX_RX_QUEUES;
> + tp->rss_data->hw_supp_indir_tbl_entries = RTL_MAX_INDIRECTION_TABLE_ENTRIES;
> break;
> default:
> tp->hw_supp_num_rx_queues = R8169_DEFAULT_RX_QUEUES;
> break;
> }
> + tp->init_rx_desc_type = RX_DESC_TYPE_DEFAULT;
> }
>
> static void rtl_request_firmware(struct rtl8169_private *tp)
> @@ -2873,6 +2962,64 @@ static void rtl_set_rx_max_size(struct rtl8169_private *tp)
> RTL_W16(tp, RxMaxSize, R8169_RX_BUF_SIZE + 1);
> }
>
> +static void rtl8169_store_rss_key(struct rtl8169_private *tp)
> +{
> + u32 num_entries = RTL_RSS_KEY_SIZE / sizeof(u32);
> + u32 *rss_key = tp->rss_data->rss_key;
> + const u16 rss_key_reg = RSS_KEY_REG;
> +
> + /* Write RSS key to HW */
> + for (int i = 0; i < num_entries; i++)
> + RTL_W32(tp, rss_key_reg + (i * 4), rss_key[i]);
> +}
> +
> +static void rtl8169_store_reta(struct rtl8169_private *tp)
> +{
> + u32 i, reta_entries = tp->rss_data->hw_supp_indir_tbl_entries;
> + u16 indir_tbl_reg = RSS_INDIRECTION_TBL_REG;
Do these variables really require fixed-width types?
And why is the variable indir_tbl_reg needed at all?
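A rough, untested sketch of what I have in mind, written as standalone C
so it can be compiled outside the driver. The register offset is taken
from the patch; fake_regs[] and fake_w32() are hypothetical stand-ins for
the MMIO space and RTL_W32(), and store_reta() is not meant as the final
form. It assumes the number of entries is a multiple of four, which holds
for the 128 entries used here.

```c
#include <assert.h>
#include <stdint.h>

#define RSS_INDIRECTION_TBL_REG 0x4700	/* value from the patch */

/* Hypothetical stand-in for the device register space. */
static uint32_t fake_regs[0x5000 / 4];

/* Hypothetical stand-in for RTL_W32(). */
static void fake_w32(unsigned int reg, uint32_t val)
{
	fake_regs[reg / 4] = val;
}

/* Pack four 8-bit table entries into each 32-bit register write. */
static void store_reta(const uint8_t *indir_tbl, unsigned int entries)
{
	assert(entries % 4 == 0);

	for (unsigned int i = 0; i < entries; i += 4) {
		uint32_t reta = (uint32_t)indir_tbl[i] |
				(uint32_t)indir_tbl[i + 1] << 8 |
				(uint32_t)indir_tbl[i + 2] << 16 |
				(uint32_t)indir_tbl[i + 3] << 24;

		/* Four entries occupy four bytes, so the offset is just i. */
		fake_w32(RSS_INDIRECTION_TBL_REG + i, reta);
	}
}
```

The register offset is derived from the loop index, so no separate
register variable has to be carried along and incremented.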
> + u8 *indir_tbl = tp->rss_data->rss_indir_tbl;
> + u32 reta = 0;
> +
> + /* Write redirection table to HW */
> + for (i = 0; i < reta_entries; i++) {
> + reta |= indir_tbl[i] << (i & 0x3) * 8;
> + if ((i & 3) == 3) {
> + RTL_W32(tp, indir_tbl_reg, reta);
> + indir_tbl_reg += 4;
> + reta = 0;
> + }
> + }
> +}
> +
> +static int rtl8169_set_rss_hash_opt(struct rtl8169_private *tp)
> +{
> + u32 rss_ctrl;
> +
> + rss_ctrl = FIELD_PREP(RSS_CPU_NUM_MASK, ilog2(tp->num_rx_rings));
> +
> + /* Perform hash on these packet types */
> + rss_ctrl |= RSS_CTRL_TCP_IPV4_SUPP
> + | RSS_CTRL_IPV4_SUPP
> + | RSS_CTRL_IPV6_SUPP
> + | RSS_CTRL_IPV6_EXT_SUPP
> + | RSS_CTRL_TCP_IPV6_SUPP
> + | RSS_CTRL_TCP_IPV6_EXT_SUPP;
> +
> + rss_ctrl |= FIELD_PREP(RSS_HASH_MASK,
> + ilog2(tp->rss_data->hw_supp_indir_tbl_entries));
> +
> + RTL_W32(tp, RSS_CTRL_8125, rss_ctrl);
> +
> + return 0;
> +}
> +
> +static void rtl_set_rss_config(struct rtl8169_private *tp)
> +{
> + rtl8169_set_rss_hash_opt(tp);
> + rtl8169_store_reta(tp);
> + rtl8169_store_rss_key(tp);
> +}
> +
> static void rtl_set_rx_tx_desc_registers(struct rtl8169_private *tp)
> {
> struct rtl8169_rx_ring *ring = &tp->rx_ring[0];
> @@ -3939,6 +4086,18 @@ DECLARE_RTL_COND(rtl_mac_ocp_e00e_cond)
> return r8168_mac_ocp_read(tp, 0xe00e) & BIT(13);
> }
>
> +static void rtl8125_set_rx_q_num(struct rtl8169_private *tp)
> +{
> + u16 rx_q_num;
> + u16 q_ctrl;
> +
> + rx_q_num = (u16)ilog2(tp->num_rx_rings);
Is the cast needed here?
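To illustrate, here is a standalone sketch of the same read-modify-write
without the cast. It is an assumption-laden illustration, not driver
code: ilog2_u32() is a hypothetical userspace replacement for the
kernel's ilog2(), the mask is GENMASK(4, 2) written out by hand, and
set_rx_q_num() takes and returns the register value instead of touching
hardware.

```c
#include <assert.h>
#include <stdint.h>

#define RTL_RX_Q_NUM_MASK 0x001c	/* GENMASK(4, 2) in the patch */
#define RTL_RX_Q_NUM_SHIFT 2

/* Hypothetical userspace replacement for the kernel's ilog2(). */
static unsigned int ilog2_u32(uint32_t v)
{
	unsigned int log = 0;

	while (v >>= 1)
		log++;
	return log;
}

/* Return the new Q_NUM_CTRL value for the given number of rx rings. */
static uint16_t set_rx_q_num(uint16_t q_ctrl, unsigned int num_rx_rings)
{
	uint16_t rx_q_num;

	/* The ring count is expected to be a non-zero power of two. */
	assert(num_rx_rings && !(num_rx_rings & (num_rx_rings - 1)));

	/* No cast: the result is at most 3 for up to 8 rings. */
	rx_q_num = ilog2_u32(num_rx_rings);

	q_ctrl &= ~RTL_RX_Q_NUM_MASK;
	q_ctrl |= (rx_q_num << RTL_RX_Q_NUM_SHIFT) & RTL_RX_Q_NUM_MASK;
	return q_ctrl;
}
```

The implicit conversion is value-preserving here because the logarithm
of the ring count is tiny compared to the range of a 16-bit variable.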
> + q_ctrl = RTL_R16(tp, Q_NUM_CTRL_8125);
> + q_ctrl &= ~RTL_RX_Q_NUM_MASK;
> + q_ctrl |= FIELD_PREP(RTL_RX_Q_NUM_MASK, rx_q_num);
> + RTL_W16(tp, Q_NUM_CTRL_8125, q_ctrl);
> +}
> +
> static void rtl8169_hw_enable_vec_mapping(struct rtl8169_private *tp)
> {
> u8 tmp;
> @@ -3978,6 +4137,13 @@ static void rtl_hw_start_8125_common(struct rtl8169_private *tp)
> tp->mac_version == RTL_GIGA_MAC_VER_80)
> RTL_W8(tp, 0xD8, RTL_R8(tp, 0xD8) & ~0x02);
>
> + /* enable rx descriptor type v4 and set queue num for rss*/
> + if (tp->num_rx_rings > 1) {
> + rtl8125_set_rx_q_num(tp);
> + RTL_W8(tp, RTL_DESC_TYPE_CTRL,
> + RTL_R8(tp, RTL_DESC_TYPE_CTRL) | RTL_DESC_TYPE_RSS);
> + }
> +
> if (tp->mac_version == RTL_GIGA_MAC_VER_80)
> r8168_mac_ocp_modify(tp, 0xe614, 0x0f00, 0x0f00);
> else if (tp->mac_version == RTL_GIGA_MAC_VER_70)
> @@ -4214,6 +4380,12 @@ static void rtl_hw_start(struct rtl8169_private *tp)
> rtl_hw_aspm_clkreq_enable(tp, true);
> rtl_set_rx_max_size(tp);
> rtl_set_rx_tx_desc_registers(tp);
> + if (rtl_is_8125(tp)) {
> + if (tp->num_rx_rings > 1)
> + rtl_set_rss_config(tp);
> + else
> + RTL_W32(tp, RSS_CTRL_8125, 0x00);
> + }
> rtl_lock_config_regs(tp);
>
> rtl_jumbo_config(tp);
> @@ -4241,14 +4413,26 @@ static int rtl8169_change_mtu(struct net_device *dev, int new_mtu)
> return 0;
> }
>
> -static void rtl8169_mark_to_asic(struct RxDesc *desc)
> +static void rtl8169_mark_to_asic(struct rtl8169_private *tp, struct RxDesc *desc)
> {
> - u32 eor = le32_to_cpu(desc->opts1) & RingEnd;
> + u32 eor;
>
> - desc->opts2 = 0;
> - /* Force memory writes to complete before releasing descriptor */
> - dma_wmb();
> - WRITE_ONCE(desc->opts1, cpu_to_le32(DescOwn | eor | R8169_RX_BUF_SIZE));
> + switch (tp->init_rx_desc_type) {
> + case RX_DESC_TYPE_RSS:
> + eor = le32_to_cpu(desc->rss_opts1) & RingEnd;
> + desc->rss_opts2 = cpu_to_le32(0);
> + /* Force memory writes to complete before releasing descriptor */
> + dma_wmb();
> + WRITE_ONCE(desc->rss_opts1, cpu_to_le32(DescOwn | eor | R8169_RX_BUF_SIZE));
> + break;
> + default:
> + eor = le32_to_cpu(desc->opts1) & RingEnd;
> + desc->opts2 = cpu_to_le32(0);
> + /* Force memory writes to complete before releasing descriptor */
> + dma_wmb();
> + WRITE_ONCE(desc->opts1, cpu_to_le32(DescOwn | eor | R8169_RX_BUF_SIZE));
> + break;
> + }
> }
>
> static struct page *rtl8169_alloc_rx_data(struct rtl8169_private *tp,
> @@ -4271,9 +4455,12 @@ static struct page *rtl8169_alloc_rx_data(struct rtl8169_private *tp,
> return NULL;
> }
>
> - desc->addr = cpu_to_le64(mapping);
> ring->rx_desc_phy_addr[index] = mapping;
> - rtl8169_mark_to_asic(desc);
> + if (tp->init_rx_desc_type == RX_DESC_TYPE_RSS)
> + desc->rss_addr = cpu_to_le64(mapping);
> + else
> + desc->addr = cpu_to_le64(mapping);
> + rtl8169_mark_to_asic(tp, desc);
>
> return data;
> }
> @@ -4289,8 +4476,25 @@ static void rtl8169_rx_clear(struct rtl8169_private *tp, struct rtl8169_rx_ring
> __free_pages(ring->rx_databuff[i], get_order(R8169_RX_BUF_SIZE));
> ring->rx_databuff[i] = NULL;
> ring->rx_desc_phy_addr[i] = 0;
> - ring->rx_desc_array[i].addr = 0;
> - ring->rx_desc_array[i].opts1 = 0;
> + if (tp->init_rx_desc_type == RX_DESC_TYPE_RSS) {
> + ring->rx_desc_array[i].rss_addr = 0;
> + ring->rx_desc_array[i].rss_opts1 = 0;
> + } else {
> + ring->rx_desc_array[i].addr = 0;
> + ring->rx_desc_array[i].opts1 = 0;
> + }
> + }
> +}
> +
> +static void rtl8169_mark_as_last_descriptor(struct rtl8169_private *tp, struct RxDesc *desc)
> +{
> + switch (tp->init_rx_desc_type) {
> + case RX_DESC_TYPE_RSS:
> + desc->rss_opts1 |= cpu_to_le32(RingEnd);
> + break;
> + default:
> + desc->opts1 |= cpu_to_le32(RingEnd);
> + break;
> }
> }
>
> @@ -4310,7 +4514,7 @@ static int rtl8169_rx_fill(struct rtl8169_private *tp, struct rtl8169_rx_ring *r
> }
>
> /* mark as last descriptor in the ring */
> - ring->rx_desc_array[NUM_RX_DESC - 1].opts1 |= cpu_to_le32(RingEnd);
> + rtl8169_mark_as_last_descriptor(tp, &ring->rx_desc_array[NUM_RX_DESC - 1]);
>
> return 0;
> }
> @@ -4466,7 +4670,7 @@ static void rtl8169_rx_desc_reset(struct rtl8169_private *tp)
> struct rtl8169_rx_ring *ring = &tp->rx_ring[i];
>
> for (int j = 0; j < NUM_RX_DESC; j++)
> - rtl8169_mark_to_asic(ring->rx_desc_array + j);
> + rtl8169_mark_to_asic(tp, ring->rx_desc_array + j);
> }
> }
>
> @@ -4922,35 +5126,104 @@ static inline int rtl8169_fragmented_frame(u32 status)
> return (status & (FirstFrag | LastFrag)) != (FirstFrag | LastFrag);
> }
>
> -static inline void rtl8169_rx_csum(struct sk_buff *skb,
> +static inline void rtl8169_rx_hash(struct rtl8169_private *tp,
> + struct RxDesc *desc,
> + struct sk_buff *skb)
> +{
> + u32 rss_header_info;
> + u32 hash_val;
> +
> + if (!(tp->dev->features & NETIF_F_RXHASH))
> + return;
> +
> + rss_header_info = le32_to_cpu(desc->rss_dword.rss_info);
> +
> + if (!(rss_header_info & RXS_RSS_L3_TYPE_MASK))
> + return;
> +
> + hash_val = le32_to_cpu(desc->rss_dword.rss_result);
> +
> + skb_set_hash(skb, hash_val,
> + (RXS_RSS_L4_TYPE_MASK & rss_header_info) ?
> + PKT_HASH_TYPE_L4 : PKT_HASH_TYPE_L3);
> +}
> +
> +static inline void rtl8169_rx_csum(struct rtl8169_private *tp,
> + struct sk_buff *skb,
> struct RxDesc *desc)
> {
> - u32 status = le32_to_cpu(desc->opts1) & (RxProtoMask | RxCSFailMask);
> + bool csum_ok = false;
> + u32 opts1;
>
> - if (status == RxProtoTCP || status == RxProtoUDP)
> + switch (tp->init_rx_desc_type) {
> + case RX_DESC_TYPE_RSS:
> + opts1 = le32_to_cpu(desc->rss_opts1);
> + if (((opts1 & RX_TCPT_DESC_RSS) && !(opts1 & RX_TCPF_DESC_RSS)) ||
> + ((opts1 & RX_UDPT_DESC_RSS) && !(opts1 & RX_UDPF_DESC_RSS)))
> + csum_ok = true;
> + break;
> + default:
> + opts1 = le32_to_cpu(desc->opts1) & (RxProtoMask | RxCSFailMask);
> + if (opts1 == RxProtoTCP || opts1 == RxProtoUDP)
> + csum_ok = true;
> + break;
> + }
> +
> + if (csum_ok)
> skb->ip_summed = CHECKSUM_UNNECESSARY;
> else
> skb_checksum_none_assert(skb);
> }
>
> +static __le32 rtl8169_rx_desc_opts1(struct rtl8169_private *tp, struct RxDesc *desc)
> +{
> + switch (tp->init_rx_desc_type) {
> + case RX_DESC_TYPE_RSS:
> + return READ_ONCE(desc->rss_opts1);
> + default:
> + return READ_ONCE(desc->opts1);
> + }
> +}
> +
> static bool rtl8169_check_rx_desc_error(struct net_device *dev,
> struct rtl8169_private *tp,
> u32 status)
> {
> - if (unlikely(status & RxRES)) {
> - if (status & (RxRWT | RxRUNT))
> - dev->stats.rx_length_errors++;
> - if (status & RxCRC)
> - dev->stats.rx_crc_errors++;
> - return true;
> + switch (tp->init_rx_desc_type) {
> + case RX_DESC_TYPE_RSS:
> + if (unlikely(status & RX_RES_RSS)) {
> + if (status & RX_RUNT_RSS)
> + dev->stats.rx_length_errors++;
> + if (status & RX_CRC_RSS)
> + dev->stats.rx_crc_errors++;
> + return true;
> + }
> + break;
> + default:
> + if (unlikely(status & RxRES)) {
> + if (status & (RxRWT | RxRUNT))
> + dev->stats.rx_length_errors++;
> + if (status & RxCRC)
> + dev->stats.rx_crc_errors++;
> + return true;
> + }
> + break;
> }
> return false;
> }
>
> -static void rtl8169_set_desc_dma_addr(struct RxDesc *desc,
> +static void rtl8169_set_desc_dma_addr(struct rtl8169_private *tp,
> + struct RxDesc *desc,
> dma_addr_t mapping)
> {
> - desc->addr = cpu_to_le64(mapping);
> + switch (tp->init_rx_desc_type) {
> + case RX_DESC_TYPE_RSS:
> + desc->rss_addr = cpu_to_le64(mapping);
> + break;
> + default:
> + desc->addr = cpu_to_le64(mapping);
> + break;
> + }
> }
>
> static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp,
> @@ -4967,7 +5240,7 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp,
> dma_addr_t addr;
> u32 status;
>
> - status = le32_to_cpu(READ_ONCE(desc->opts1));
> + status = le32_to_cpu(rtl8169_rx_desc_opts1(tp, desc));
>
> if (status & DescOwn) {
> if (!tp->recheck_desc_ownbit)
> @@ -4983,7 +5256,7 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp,
> */
> tp->recheck_desc_ownbit = false;
> RTL_R8(tp, LED_CTRL);
> - status = le32_to_cpu(READ_ONCE(desc->opts1));
> + status = le32_to_cpu(rtl8169_rx_desc_opts1(tp, desc));
> if (status & DescOwn)
> break;
> }
> @@ -5034,11 +5307,12 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp,
> skb->tail += pkt_size;
> skb->len = pkt_size;
> dma_sync_single_for_device(d, addr, pkt_size, DMA_FROM_DEVICE);
> -
> - rtl8169_rx_csum(skb, desc);
> + if (tp->num_rx_rings > 1)
> + rtl8169_rx_hash(tp, desc, skb);
> + rtl8169_rx_csum(tp, skb, desc);
> skb->protocol = eth_type_trans(skb, dev);
>
> - rtl8169_rx_vlan_tag(desc, skb);
> + rtl8169_rx_vlan_tag(tp, desc, skb);
>
> if (skb->pkt_type == PACKET_MULTICAST)
> dev->stats.multicast++;
> @@ -5047,8 +5321,8 @@ static int rtl_rx(struct net_device *dev, struct rtl8169_private *tp,
>
> dev_sw_netstats_rx_add(dev, pkt_size);
> release_descriptor:
> - rtl8169_set_desc_dma_addr(desc, ring->rx_desc_phy_addr[entry]);
> - rtl8169_mark_to_asic(desc);
> + rtl8169_set_desc_dma_addr(tp, desc, ring->rx_desc_phy_addr[entry]);
> + rtl8169_mark_to_asic(tp, desc);
> }
>
> return count;
> @@ -5604,6 +5878,32 @@ static void rtl_set_irq_mask(struct rtl8169_private *tp)
> }
> }
>
> +static int get_max_irq_nvecs(struct rtl8169_private *tp)
> +{
> + if (tp->mac_version == RTL_GIGA_MAC_VER_80)
> + return R8127_MAX_NUM_IRQVEC;
> + return R8169_IRQ_DEFAULT;
> +}
> +
> +static int get_min_irq_nvecs(struct rtl8169_private *tp)
> +{
> + if (tp->mac_version == RTL_GIGA_MAC_VER_80)
> + return R8127_MIN_NUM_IRQVEC;
> + return R8169_IRQ_DEFAULT;
> +}
> +
> +static void rtl8169_set_rx_ring_num(struct rtl8169_private *tp)
> +{
> + if (tp->irq_nvecs >= get_min_irq_nvecs(tp)) {
> + unsigned int rss_queue_num = netif_get_num_default_rss_queues();
> +
> + tp->num_rx_rings = rounddown_pow_of_two(min(rss_queue_num,
> + tp->hw_supp_num_rx_queues));
> + if (tp->num_rx_rings >= 2)
> + tp->init_rx_desc_type = RX_DESC_TYPE_RSS;
> + }
> +}
> +
> static int rtl_alloc_irq(struct rtl8169_private *tp)
> {
> struct pci_dev *pdev = tp->pci_dev;
> @@ -5624,7 +5924,10 @@ static int rtl_alloc_irq(struct rtl8169_private *tp)
> break;
> }
>
> - nvecs = pci_alloc_irq_vectors(pdev, 1, 1, flags);
> + nvecs = pci_alloc_irq_vectors(pdev, get_min_irq_nvecs(tp), get_max_irq_nvecs(tp), flags);
> +
> + if (nvecs < 0)
> + nvecs = pci_alloc_irq_vectors(pdev, 1, 1, flags);
>
> if (nvecs < 0)
> return nvecs;
> @@ -6071,6 +6374,12 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
> tp->dash_type = rtl_get_dash_type(tp);
> tp->dash_enabled = rtl_dash_is_enabled(tp);
>
> + if (rtl_hw_support_rss(tp)) {
> + tp->rss_data = devm_kzalloc(&pdev->dev, sizeof(*tp->rss_data), GFP_KERNEL);
> + if (!tp->rss_data)
> + return -ENOMEM;
> + }
> +
> tp->cp_cmd = RTL_R16(tp, CPlusCmd) & CPCMD_MASK;
>
> if (sizeof(dma_addr_t) > 4 && tp->mac_version >= RTL_GIGA_MAC_VER_18 &&
> @@ -6096,6 +6405,11 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
> if (!tp->rtl8169_napi)
> return -ENOMEM;
>
> + rtl8169_set_rx_ring_num(tp);
> +
> + if (rtl_hw_support_rss(tp))
> + rtl8169_init_rss(tp);
> +
> INIT_WORK(&tp->wk.work, rtl_task);
> disable_work(&tp->wk.work);
>
> @@ -6110,6 +6424,11 @@ static int rtl_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
> dev->vlan_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO;
> dev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
>
> + if (rtl_hw_support_rss(tp) && tp->num_rx_rings > 1) {
> + dev->hw_features |= NETIF_F_RXHASH;
> + dev->features |= NETIF_F_RXHASH;
> + }
> +
> /*
> * Pretend we are using VLANs; This bypasses a nasty bug where
> * Interrupts stop flowing on high load on 8110SCd controllers.
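
For readers following the quoted hunk, the RX ring count chosen in rtl8169_set_rx_ring_num() can be modelled in userspace as below. This is only a sketch: rounddown_pow2() is a stand-in for the kernel's rounddown_pow_of_two(), and pick_rx_rings() and its parameter names are hypothetical, not part of the patch. It assumes both inputs are non-zero.

```c
/*
 * Userspace stand-in for the kernel's rounddown_pow_of_two().
 * Returns the largest power of two that is <= n; n must be non-zero.
 */
static unsigned int rounddown_pow2(unsigned int n)
{
	unsigned int p = 1;

	/* Double p while the doubled value would still fit within n. */
	while (p <= n / 2)
		p *= 2;
	return p;
}

/*
 * Hypothetical helper mirroring the min() + round-down step of
 * rtl8169_set_rx_ring_num(): take the smaller of the default RSS
 * queue count and the hardware queue limit, then round it down to
 * a power of two.
 */
static unsigned int pick_rx_rings(unsigned int default_rss_queues,
				  unsigned int hw_max_queues)
{
	unsigned int n = default_rss_queues < hw_max_queues ?
			 default_rss_queues : hw_max_queues;

	return rounddown_pow2(n);
}
```

Under these assumptions, a default of 6 RSS queues with an 8-queue hardware limit gives 4 rings. A default of 1 gives a single ring, in which case the quoted code keeps the non-RSS descriptor layout because num_rx_rings is below 2.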
Thread overview: 18+ messages (newest: 2026-06-02 21:23 UTC)
2026-05-26 8:11 [Patch net-next v6 0/7] r8169: add RSS support for RTL8127 javen
2026-05-26 8:11 ` [Patch net-next v6 1/7] r8169: add support for multi irqs javen
2026-05-29 1:00 ` Jakub Kicinski
2026-05-29 5:43 ` Javen
2026-05-29 18:07 ` Jakub Kicinski
2026-06-02 21:22 ` Heiner Kallweit
2026-05-26 8:11 ` [Patch net-next v6 2/7] r8169: add support for multi rx queues javen
2026-05-29 1:04 ` Jakub Kicinski
2026-05-29 6:47 ` Javen
2026-05-29 18:07 ` Jakub Kicinski
2026-05-26 8:11 ` [Patch net-next v6 3/7] r8169: add support for new interrupt mapping javen
2026-06-02 21:23 ` Heiner Kallweit
2026-05-26 8:11 ` [Patch net-next v6 4/7] r8169: enable " javen
2026-05-26 8:11 ` [Patch net-next v6 5/7] r8169: add support and enable rss javen
2026-06-02 21:23 ` Heiner Kallweit
2026-05-26 8:11 ` [Patch net-next v6 6/7] r8169: move struct ethtool_ops javen
2026-05-26 8:11 ` [Patch net-next v6 7/7] r8169: support setting rx queue numbers via ethtool javen
2026-06-01 2:14 ` [Patch net-next v6 0/7] r8169: add RSS support for RTL8127 Javen