* [PATCH net-next 0/4] net: make dev_watchdog() less intrusive
@ 2021-11-17 3:29 Eric Dumazet
2021-11-17 3:29 ` [PATCH net-next 1/4] net: use an atomic_long_t for queue->trans_timeout Eric Dumazet
` (4 more replies)
0 siblings, 5 replies; 7+ messages in thread
From: Eric Dumazet @ 2021-11-17 3:29 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski; +Cc: netdev, Eric Dumazet, Eric Dumazet
From: Eric Dumazet <edumazet@google.com>
dev_watchdog() is used on many NIC to periodically monitor TX queues
to detect hangs.
Problem is : It stops all queues, then check them, then 'unfreeze' them.
Not only this stops feeding the NIC, it also migrates all qdiscs
to be serviced on the cpu calling netif_tx_unlock(), causing
a potential latency artifact.
With many TX queues, this is becoming more visible.
Eric Dumazet (4):
net: use an atomic_long_t for queue->trans_timeout
net: annotate accesses to queue->trans_start
net: do not inline netif_tx_lock()/netif_tx_unlock()
net: no longer stop all TX queues in dev_watchdog()
.../net/ethernet/apm/xgene/xgene_enet_main.c | 2 +-
drivers/net/ethernet/atheros/ag71xx.c | 2 +-
.../net/ethernet/freescale/dpaa/dpaa_eth.c | 4 +-
.../net/ethernet/hisilicon/hns3/hns3_enet.c | 2 +-
drivers/net/ethernet/ibm/ibmvnic.c | 2 +-
drivers/net/ethernet/intel/igb/igb_main.c | 4 +-
.../mellanox/mlx5/core/en/reporter_tx.c | 2 +-
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 6 +-
drivers/net/ethernet/ti/am65-cpsw-nuss.c | 2 +-
drivers/net/virtio_net.c | 2 +-
drivers/net/wireless/marvell/mwifiex/init.c | 2 +-
drivers/staging/rtl8192e/rtllib_softmac.c | 2 +-
include/linux/netdevice.h | 57 +++++----------
net/core/net-sysfs.c | 6 +-
net/sched/sch_generic.c | 69 ++++++++++++++++---
15 files changed, 94 insertions(+), 70 deletions(-)
--
2.34.0.rc1.387.gb447b232ab-goog
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH net-next 1/4] net: use an atomic_long_t for queue->trans_timeout
2021-11-17 3:29 [PATCH net-next 0/4] net: make dev_watchdog() less intrusive Eric Dumazet
@ 2021-11-17 3:29 ` Eric Dumazet
2021-11-17 3:29 ` [PATCH net-next 2/4] net: annotate accesses to queue->trans_start Eric Dumazet
` (3 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2021-11-17 3:29 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski
Cc: netdev, Eric Dumazet, Eric Dumazet, david decotigny
From: Eric Dumazet <edumazet@google.com>
tx_timeout_show() assumed dev_watchdog() would stop all
the queues, to fetch queue->trans_timeout under protection
of the queue->_xmit_lock.
As we want to no longer disrupt transmits, we use an
atomic_long_t instead.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: david decotigny <david.decotigny@google.com>
---
include/linux/netdevice.h | 2 +-
net/core/net-sysfs.c | 6 +-----
net/sched/sch_generic.c | 2 +-
3 files changed, 3 insertions(+), 7 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 31a7e6b2768123021690a3dc8572c5e8cb0e0027..143ac02c7f1cc90cf6704574fb0012e1ba830c70 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -592,7 +592,7 @@ struct netdev_queue {
* Number of TX timeouts for this queue
* (/sys/class/net/DEV/Q/trans_timeout)
*/
- unsigned long trans_timeout;
+ atomic_long_t trans_timeout;
/* Subordinate device that the queue has been assigned to */
struct net_device *sb_dev;
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 9c01c642cf9ef384fe54e56243b102ef838d0a62..addbef5419fbb62ce83f5132ae21c9d2872e95f5 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -1201,11 +1201,7 @@ static const struct sysfs_ops netdev_queue_sysfs_ops = {
static ssize_t tx_timeout_show(struct netdev_queue *queue, char *buf)
{
- unsigned long trans_timeout;
-
- spin_lock_irq(&queue->_xmit_lock);
- trans_timeout = queue->trans_timeout;
- spin_unlock_irq(&queue->_xmit_lock);
+ unsigned long trans_timeout = atomic_long_read(&queue->trans_timeout);
return sprintf(buf, fmt_ulong, trans_timeout);
}
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 3b0f620958037eb46e395f172c2315fdd98be914..1b4328bd495d54d44a9d51b53c8e8bc18b9cc294 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -467,7 +467,7 @@ static void dev_watchdog(struct timer_list *t)
time_after(jiffies, (trans_start +
dev->watchdog_timeo))) {
some_queue_timedout = 1;
- txq->trans_timeout++;
+ atomic_long_inc(&txq->trans_timeout);
break;
}
}
--
2.34.0.rc1.387.gb447b232ab-goog
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH net-next 2/4] net: annotate accesses to queue->trans_start
2021-11-17 3:29 [PATCH net-next 0/4] net: make dev_watchdog() less intrusive Eric Dumazet
2021-11-17 3:29 ` [PATCH net-next 1/4] net: use an atomic_long_t for queue->trans_timeout Eric Dumazet
@ 2021-11-17 3:29 ` Eric Dumazet
2021-11-18 18:27 ` kernel test robot
2021-11-17 3:29 ` [PATCH net-next 3/4] net: do not inline netif_tx_lock()/netif_tx_unlock() Eric Dumazet
` (2 subsequent siblings)
4 siblings, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2021-11-17 3:29 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski; +Cc: netdev, Eric Dumazet, Eric Dumazet
From: Eric Dumazet <edumazet@google.com>
In following patches, dev_watchdog() will no longer stop all queues.
It will read queue->trans_start locklessly.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
drivers/net/ethernet/apm/xgene/xgene_enet_main.c | 2 +-
drivers/net/ethernet/atheros/ag71xx.c | 2 +-
drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 4 ++--
drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 2 +-
drivers/net/ethernet/ibm/ibmvnic.c | 2 +-
drivers/net/ethernet/intel/igb/igb_main.c | 4 ++--
.../ethernet/mellanox/mlx5/core/en/reporter_tx.c | 2 +-
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 6 +++---
drivers/net/ethernet/ti/am65-cpsw-nuss.c | 2 +-
drivers/net/virtio_net.c | 2 +-
drivers/net/wireless/marvell/mwifiex/init.c | 2 +-
drivers/staging/rtl8192e/rtllib_softmac.c | 2 +-
include/linux/netdevice.h | 16 +++++++++++++---
net/sched/sch_generic.c | 8 ++++----
14 files changed, 33 insertions(+), 23 deletions(-)
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
index 220dc42af31ae1200ca05441000bb2a1abd0fd89..ff2d099aab218b30783266d5f905e3f0846f0951 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
@@ -869,7 +869,7 @@ static void xgene_enet_timeout(struct net_device *ndev, unsigned int txqueue)
for (i = 0; i < pdata->txq_cnt; i++) {
txq = netdev_get_tx_queue(ndev, i);
- txq->trans_start = jiffies;
+ txq_trans_cond_update(txq);
netif_tx_start_queue(txq);
}
}
diff --git a/drivers/net/ethernet/atheros/ag71xx.c b/drivers/net/ethernet/atheros/ag71xx.c
index 88d2ab7483994a850efcea1228d9718e0f6bc2ae..e4f30bb7498fec02742376a37a22495cba38a9ea 100644
--- a/drivers/net/ethernet/atheros/ag71xx.c
+++ b/drivers/net/ethernet/atheros/ag71xx.c
@@ -766,7 +766,7 @@ static bool ag71xx_check_dma_stuck(struct ag71xx *ag)
unsigned long timestamp;
u32 rx_sm, tx_sm, rx_fd;
- timestamp = netdev_get_tx_queue(ag->ndev, 0)->trans_start;
+ timestamp = READ_ONCE(netdev_get_tx_queue(ag->ndev, 0)->trans_start);
if (likely(time_before(jiffies, timestamp + HZ / 10)))
return false;
diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
index 6b2927d863e2cc7569dadd4cd1e974dcc347274e..d6871437d9515df18c7819817799f9ade8bcb57e 100644
--- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
+++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
@@ -2325,7 +2325,7 @@ dpaa_start_xmit(struct sk_buff *skb, struct net_device *net_dev)
txq = netdev_get_tx_queue(net_dev, queue_mapping);
/* LLTX requires to do our own update of trans_start */
- txq->trans_start = jiffies;
+ txq_trans_cond_update(txq);
if (priv->tx_tstamp && skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) {
fd.cmd |= cpu_to_be32(FM_FD_CMD_UPD);
@@ -2531,7 +2531,7 @@ static int dpaa_xdp_xmit_frame(struct net_device *net_dev,
/* Bump the trans_start */
txq = netdev_get_tx_queue(net_dev, smp_processor_id());
- txq->trans_start = jiffies;
+ txq_trans_cond_update(txq);
err = dpaa_xmit(priv, percpu_stats, smp_processor_id(), &fd);
if (err) {
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
index 13835a37b3a2fd13d557750fb3943172fc2eb700..d5100179f8d589dc4ea517f5bafca0612a5fcc38 100644
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c
@@ -2679,7 +2679,7 @@ static bool hns3_get_tx_timeo_queue_info(struct net_device *ndev)
unsigned long trans_start;
q = netdev_get_tx_queue(ndev, i);
- trans_start = q->trans_start;
+ trans_start = READ_ONCE(q->trans_start);
if (netif_xmit_stopped(q) &&
time_after(jiffies,
(trans_start + ndev->watchdog_timeo))) {
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 3cca51735421a7435f4c7f32fa3f5af9003f2d37..c327fc8860da20e16b6801adc071bfb90dd05c36 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2058,7 +2058,7 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
tx_packets++;
tx_bytes += skb->len;
- txq->trans_start = jiffies;
+ txq_trans_cond_update(txq);
ret = NETDEV_TX_OK;
goto out;
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 836be0d3b29105d48530e2ce6b3f8db13c730e71..18a019a47182218ff85a83a00e75c99005e22a34 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -2927,7 +2927,7 @@ static int igb_xdp_xmit_back(struct igb_adapter *adapter, struct xdp_buff *xdp)
nq = txring_txq(tx_ring);
__netif_tx_lock(nq, cpu);
/* Avoid transmit queue timeout since we share it with the slow path */
- nq->trans_start = jiffies;
+ txq_trans_cond_update(nq);
ret = igb_xmit_xdp_ring(adapter, tx_ring, xdpf);
__netif_tx_unlock(nq);
@@ -2961,7 +2961,7 @@ static int igb_xdp_xmit(struct net_device *dev, int n,
__netif_tx_lock(nq, cpu);
/* Avoid transmit queue timeout since we share it with the slow path */
- nq->trans_start = jiffies;
+ txq_trans_cond_update(nq);
for (i = 0; i < n; i++) {
struct xdp_frame *xdpf = frames[i];
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
index 4f4bc8726ec4fa2b20a32edb25780ad177f6860a..86060513328739ecd6702525a6902bd82c7d2db6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
@@ -565,7 +565,7 @@ int mlx5e_reporter_tx_timeout(struct mlx5e_txqsq *sq)
snprintf(err_str, sizeof(err_str),
"TX timeout on queue: %d, SQ: 0x%x, CQ: 0x%x, SQ Cons: 0x%x SQ Prod: 0x%x, usecs since last trans: %u",
sq->ch_ix, sq->sqn, sq->cq.mcq.cqn, sq->cc, sq->pc,
- jiffies_to_usecs(jiffies - sq->txq->trans_start));
+ jiffies_to_usecs(jiffies - READ_ONCE(sq->txq->trans_start)));
mlx5e_health_report(priv, priv->tx_reporter, err_str, &err_ctx);
return to_ctx.status;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 033c35c09a54876eeb87e30aad5ec8e8613f13b9..389d125310c151e54b428d19616fd07531051f54 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2356,7 +2356,7 @@ static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget)
bool work_done = true;
/* Avoids TX time-out as we are sharing with slow path */
- nq->trans_start = jiffies;
+ txq_trans_cond_update(nq->trans_start);
budget = min(budget, stmmac_tx_avail(priv, queue));
@@ -4657,7 +4657,7 @@ static int stmmac_xdp_xmit_back(struct stmmac_priv *priv,
__netif_tx_lock(nq, cpu);
/* Avoids TX time-out as we are sharing with slow path */
- nq->trans_start = jiffies;
+ txq_trans_cond_update(nq->trans_start);
res = stmmac_xdp_xmit_xdpf(priv, queue, xdpf, false);
if (res == STMMAC_XDP_TX)
@@ -6293,7 +6293,7 @@ static int stmmac_xdp_xmit(struct net_device *dev, int num_frames,
__netif_tx_lock(nq, cpu);
/* Avoids TX time-out as we are sharing with slow path */
- nq->trans_start = jiffies;
+ txq_trans_cond_update(nq);
for (i = 0; i < num_frames; i++) {
int res;
diff --git a/drivers/net/ethernet/ti/am65-cpsw-nuss.c b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
index c092cb61416a180a5ce1d0d28bd163e4a1dab302..750cea23e9cd02bba139a58553c4b1753956ad10 100644
--- a/drivers/net/ethernet/ti/am65-cpsw-nuss.c
+++ b/drivers/net/ethernet/ti/am65-cpsw-nuss.c
@@ -345,7 +345,7 @@ static void am65_cpsw_nuss_ndo_host_tx_timeout(struct net_device *ndev,
netif_txq = netdev_get_tx_queue(ndev, txqueue);
tx_chn = &common->tx_chns[txqueue];
- trans_start = netif_txq->trans_start;
+ trans_start = READ_ONCE(netif_txq->trans_start);
netdev_err(ndev, "txq:%d DRV_XOFF:%d tmo:%u dql_avail:%d free_desc:%zu\n",
txqueue,
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 1771d6e5224fd834a4dfca4ba578134439d4d201..03e38e38ee4b5a97567eb692cd84a55722a1a8b2 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2694,7 +2694,7 @@ static void virtnet_tx_timeout(struct net_device *dev, unsigned int txqueue)
netdev_err(dev, "TX timeout on queue: %u, sq: %s, vq: 0x%x, name: %s, %u usecs ago\n",
txqueue, sq->name, sq->vq->index, sq->vq->name,
- jiffies_to_usecs(jiffies - txq->trans_start));
+ jiffies_to_usecs(jiffies - READ_ONCE(txq->trans_start)));
}
static const struct net_device_ops virtnet_netdev = {
diff --git a/drivers/net/wireless/marvell/mwifiex/init.c b/drivers/net/wireless/marvell/mwifiex/init.c
index f006a3d72b4046f435739e9218e3be6bf7001adc..88c72d1827a00d608e90e03beb20202228cd8699 100644
--- a/drivers/net/wireless/marvell/mwifiex/init.c
+++ b/drivers/net/wireless/marvell/mwifiex/init.c
@@ -332,7 +332,7 @@ void mwifiex_set_trans_start(struct net_device *dev)
int i;
for (i = 0; i < dev->num_tx_queues; i++)
- netdev_get_tx_queue(dev, i)->trans_start = jiffies;
+ txq_trans_cond_update(netdev_get_tx_queue(dev, i));
netif_trans_update(dev);
}
diff --git a/drivers/staging/rtl8192e/rtllib_softmac.c b/drivers/staging/rtl8192e/rtllib_softmac.c
index d2726d01c7573fe8230e0a7e0f7f811c1ff8cffc..aabbea48223d2f7915285c883e2ae94111bd91b6 100644
--- a/drivers/staging/rtl8192e/rtllib_softmac.c
+++ b/drivers/staging/rtl8192e/rtllib_softmac.c
@@ -2515,7 +2515,7 @@ void rtllib_stop_all_queues(struct rtllib_device *ieee)
unsigned int i;
for (i = 0; i < ieee->dev->num_tx_queues; i++)
- netdev_get_tx_queue(ieee->dev, i)->trans_start = jiffies;
+ txq_trans_cond_update(netdev_get_tx_queue(ieee->dev, i));
netif_tx_stop_all_queues(ieee->dev);
}
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 143ac02c7f1cc90cf6704574fb0012e1ba830c70..83e6204c0ba3491b56eec5c7f94e55eab7159223 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -4095,10 +4095,21 @@ static inline void __netif_tx_unlock_bh(struct netdev_queue *txq)
spin_unlock_bh(&txq->_xmit_lock);
}
+/*
+ * txq->trans_start can be read locklessly from dev_watchdog()
+ */
static inline void txq_trans_update(struct netdev_queue *txq)
{
if (txq->xmit_lock_owner != -1)
- txq->trans_start = jiffies;
+ WRITE_ONCE(txq->trans_start, jiffies);
+}
+
+static inline void txq_trans_cond_update(struct netdev_queue *txq)
+{
+ unsigned long now = jiffies;
+
+ if (READ_ONCE(txq->trans_start) != now)
+ WRITE_ONCE(txq->trans_start, now);
}
/* legacy drivers only, netdev_start_xmit() sets txq->trans_start */
@@ -4106,8 +4117,7 @@ static inline void netif_trans_update(struct net_device *dev)
{
struct netdev_queue *txq = netdev_get_tx_queue(dev, 0);
- if (txq->trans_start != jiffies)
- txq->trans_start = jiffies;
+ txq_trans_cond_update(txq);
}
/**
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 1b4328bd495d54d44a9d51b53c8e8bc18b9cc294..02c46041f76e85571fd2862e02fb409bfd8e6611 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -434,9 +434,9 @@ unsigned long dev_trans_start(struct net_device *dev)
dev = vlan_dev_real_dev(dev);
else if (netif_is_macvlan(dev))
dev = macvlan_dev_real_dev(dev);
- res = netdev_get_tx_queue(dev, 0)->trans_start;
+ res = READ_ONCE(netdev_get_tx_queue(dev, 0)->trans_start);
for (i = 1; i < dev->num_tx_queues; i++) {
- val = netdev_get_tx_queue(dev, i)->trans_start;
+ val = READ_ONCE(netdev_get_tx_queue(dev, i)->trans_start);
if (val && time_after(val, res))
res = val;
}
@@ -462,7 +462,7 @@ static void dev_watchdog(struct timer_list *t)
struct netdev_queue *txq;
txq = netdev_get_tx_queue(dev, i);
- trans_start = txq->trans_start;
+ trans_start = READ_ONCE(txq->trans_start);
if (netif_xmit_stopped(txq) &&
time_after(jiffies, (trans_start +
dev->watchdog_timeo))) {
@@ -1148,7 +1148,7 @@ static void transition_one_qdisc(struct net_device *dev,
rcu_assign_pointer(dev_queue->qdisc, new_qdisc);
if (need_watchdog_p) {
- dev_queue->trans_start = 0;
+ WRITE_ONCE(dev_queue->trans_start, 0);
*need_watchdog_p = 1;
}
}
--
2.34.0.rc1.387.gb447b232ab-goog
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH net-next 3/4] net: do not inline netif_tx_lock()/netif_tx_unlock()
2021-11-17 3:29 [PATCH net-next 0/4] net: make dev_watchdog() less intrusive Eric Dumazet
2021-11-17 3:29 ` [PATCH net-next 1/4] net: use an atomic_long_t for queue->trans_timeout Eric Dumazet
2021-11-17 3:29 ` [PATCH net-next 2/4] net: annotate accesses to queue->trans_start Eric Dumazet
@ 2021-11-17 3:29 ` Eric Dumazet
2021-11-17 3:29 ` [PATCH net-next 4/4] net: no longer stop all TX queues in dev_watchdog() Eric Dumazet
2021-11-17 15:00 ` [PATCH net-next 0/4] net: make dev_watchdog() less intrusive patchwork-bot+netdevbpf
4 siblings, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2021-11-17 3:29 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski; +Cc: netdev, Eric Dumazet, Eric Dumazet
From: Eric Dumazet <edumazet@google.com>
These are not fast path, there is no point in inlining them.
Also provide netif_freeze_queues()/netif_unfreeze_queues()
so that we can use them from dev_watchdog() in the following patch.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
include/linux/netdevice.h | 39 ++----------------------------
net/sched/sch_generic.c | 51 +++++++++++++++++++++++++++++++++++++++
2 files changed, 53 insertions(+), 37 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 83e6204c0ba3491b56eec5c7f94e55eab7159223..28e79ef5ca06f66a788ce3e3f59d158be9150332 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -4126,27 +4126,7 @@ static inline void netif_trans_update(struct net_device *dev)
*
* Get network device transmit lock
*/
-static inline void netif_tx_lock(struct net_device *dev)
-{
- unsigned int i;
- int cpu;
-
- spin_lock(&dev->tx_global_lock);
- cpu = smp_processor_id();
- for (i = 0; i < dev->num_tx_queues; i++) {
- struct netdev_queue *txq = netdev_get_tx_queue(dev, i);
-
- /* We are the only thread of execution doing a
- * freeze, but we have to grab the _xmit_lock in
- * order to synchronize with threads which are in
- * the ->hard_start_xmit() handler and already
- * checked the frozen bit.
- */
- __netif_tx_lock(txq, cpu);
- set_bit(__QUEUE_STATE_FROZEN, &txq->state);
- __netif_tx_unlock(txq);
- }
-}
+void netif_tx_lock(struct net_device *dev);
static inline void netif_tx_lock_bh(struct net_device *dev)
{
@@ -4154,22 +4134,7 @@ static inline void netif_tx_lock_bh(struct net_device *dev)
netif_tx_lock(dev);
}
-static inline void netif_tx_unlock(struct net_device *dev)
-{
- unsigned int i;
-
- for (i = 0; i < dev->num_tx_queues; i++) {
- struct netdev_queue *txq = netdev_get_tx_queue(dev, i);
-
- /* No need to grab the _xmit_lock here. If the
- * queue is not stopped for another reason, we
- * force a schedule.
- */
- clear_bit(__QUEUE_STATE_FROZEN, &txq->state);
- netif_schedule_queue(txq);
- }
- spin_unlock(&dev->tx_global_lock);
-}
+void netif_tx_unlock(struct net_device *dev);
static inline void netif_tx_unlock_bh(struct net_device *dev)
{
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 02c46041f76e85571fd2862e02fb409bfd8e6611..389e0d8fc68d12cf092a975511729a8dae1b29fb 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -445,6 +445,57 @@ unsigned long dev_trans_start(struct net_device *dev)
}
EXPORT_SYMBOL(dev_trans_start);
+static void netif_freeze_queues(struct net_device *dev)
+{
+ unsigned int i;
+ int cpu;
+
+ cpu = smp_processor_id();
+ for (i = 0; i < dev->num_tx_queues; i++) {
+ struct netdev_queue *txq = netdev_get_tx_queue(dev, i);
+
+ /* We are the only thread of execution doing a
+ * freeze, but we have to grab the _xmit_lock in
+ * order to synchronize with threads which are in
+ * the ->hard_start_xmit() handler and already
+ * checked the frozen bit.
+ */
+ __netif_tx_lock(txq, cpu);
+ set_bit(__QUEUE_STATE_FROZEN, &txq->state);
+ __netif_tx_unlock(txq);
+ }
+}
+
+void netif_tx_lock(struct net_device *dev)
+{
+ spin_lock(&dev->tx_global_lock);
+ netif_freeze_queues(dev);
+}
+EXPORT_SYMBOL(netif_tx_lock);
+
+static void netif_unfreeze_queues(struct net_device *dev)
+{
+ unsigned int i;
+
+ for (i = 0; i < dev->num_tx_queues; i++) {
+ struct netdev_queue *txq = netdev_get_tx_queue(dev, i);
+
+ /* No need to grab the _xmit_lock here. If the
+ * queue is not stopped for another reason, we
+ * force a schedule.
+ */
+ clear_bit(__QUEUE_STATE_FROZEN, &txq->state);
+ netif_schedule_queue(txq);
+ }
+}
+
+void netif_tx_unlock(struct net_device *dev)
+{
+ netif_unfreeze_queues(dev);
+ spin_unlock(&dev->tx_global_lock);
+}
+EXPORT_SYMBOL(netif_tx_unlock);
+
static void dev_watchdog(struct timer_list *t)
{
struct net_device *dev = from_timer(dev, t, watchdog_timer);
--
2.34.0.rc1.387.gb447b232ab-goog
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH net-next 4/4] net: no longer stop all TX queues in dev_watchdog()
2021-11-17 3:29 [PATCH net-next 0/4] net: make dev_watchdog() less intrusive Eric Dumazet
` (2 preceding siblings ...)
2021-11-17 3:29 ` [PATCH net-next 3/4] net: do not inline netif_tx_lock()/netif_tx_unlock() Eric Dumazet
@ 2021-11-17 3:29 ` Eric Dumazet
2021-11-17 15:00 ` [PATCH net-next 0/4] net: make dev_watchdog() less intrusive patchwork-bot+netdevbpf
4 siblings, 0 replies; 7+ messages in thread
From: Eric Dumazet @ 2021-11-17 3:29 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski; +Cc: netdev, Eric Dumazet, Eric Dumazet
From: Eric Dumazet <edumazet@google.com>
There is no reason for stopping all TX queues from dev_watchdog()
Not only this stops feeding the NIC, it also migrates all qdiscs
to be serviced on the cpu calling netif_tx_unlock(), causing
a potential latency artifact.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/sched/sch_generic.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 389e0d8fc68d12cf092a975511729a8dae1b29fb..d33804d41c5c5a9047c808fd37ba65ae8875fc79 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -500,7 +500,7 @@ static void dev_watchdog(struct timer_list *t)
{
struct net_device *dev = from_timer(dev, t, watchdog_timer);
- netif_tx_lock(dev);
+ spin_lock(&dev->tx_global_lock);
if (!qdisc_tx_is_noop(dev)) {
if (netif_device_present(dev) &&
netif_running(dev) &&
@@ -523,11 +523,13 @@ static void dev_watchdog(struct timer_list *t)
}
}
- if (some_queue_timedout) {
+ if (unlikely(some_queue_timedout)) {
trace_net_dev_xmit_timeout(dev, i);
WARN_ONCE(1, KERN_INFO "NETDEV WATCHDOG: %s (%s): transmit queue %u timed out\n",
dev->name, netdev_drivername(dev), i);
+ netif_freeze_queues(dev);
dev->netdev_ops->ndo_tx_timeout(dev, i);
+ netif_unfreeze_queues(dev);
}
if (!mod_timer(&dev->watchdog_timer,
round_jiffies(jiffies +
@@ -535,7 +537,7 @@ static void dev_watchdog(struct timer_list *t)
dev_hold(dev);
}
}
- netif_tx_unlock(dev);
+ spin_unlock(&dev->tx_global_lock);
dev_put(dev);
}
--
2.34.0.rc1.387.gb447b232ab-goog
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH net-next 0/4] net: make dev_watchdog() less intrusive
2021-11-17 3:29 [PATCH net-next 0/4] net: make dev_watchdog() less intrusive Eric Dumazet
` (3 preceding siblings ...)
2021-11-17 3:29 ` [PATCH net-next 4/4] net: no longer stop all TX queues in dev_watchdog() Eric Dumazet
@ 2021-11-17 15:00 ` patchwork-bot+netdevbpf
4 siblings, 0 replies; 7+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-11-17 15:00 UTC (permalink / raw)
To: Eric Dumazet; +Cc: davem, kuba, netdev, edumazet
Hello:
This series was applied to netdev/net-next.git (master)
by David S. Miller <davem@davemloft.net>:
On Tue, 16 Nov 2021 19:29:20 -0800 you wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> dev_watchdog() is used on many NIC to periodically monitor TX queues
> to detect hangs.
>
> Problem is : It stops all queues, then check them, then 'unfreeze' them.
>
> [...]
Here is the summary with links:
- [net-next,1/4] net: use an atomic_long_t for queue->trans_timeout
https://git.kernel.org/netdev/net-next/c/8160fb43d55d
- [net-next,2/4] net: annotate accesses to queue->trans_start
https://git.kernel.org/netdev/net-next/c/5337824f4dc4
- [net-next,3/4] net: do not inline netif_tx_lock()/netif_tx_unlock()
https://git.kernel.org/netdev/net-next/c/dab8fe320726
- [net-next,4/4] net: no longer stop all TX queues in dev_watchdog()
https://git.kernel.org/netdev/net-next/c/bec251bc8b6a
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net-next 2/4] net: annotate accesses to queue->trans_start
2021-11-17 3:29 ` [PATCH net-next 2/4] net: annotate accesses to queue->trans_start Eric Dumazet
@ 2021-11-18 18:27 ` kernel test robot
0 siblings, 0 replies; 7+ messages in thread
From: kernel test robot @ 2021-11-18 18:27 UTC (permalink / raw)
To: kbuild-all
[-- Attachment #1: Type: text/plain, Size: 6808 bytes --]
Hi Eric,
I love your patch! Perhaps something to improve:
[auto build test WARNING on net-next/master]
url: https://github.com/0day-ci/linux/commits/Eric-Dumazet/net-make-dev_watchdog-less-intrusive/20211117-113056
base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 62803fec52f80e4dd375de2dd76510c405792928
config: mips-allyesconfig (attached as .config)
compiler: mips-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/461b045f634f4e17925ad5622adadf9cbf474cec
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Eric-Dumazet/net-make-dev_watchdog-less-intrusive/20211117-113056
git checkout 461b045f634f4e17925ad5622adadf9cbf474cec
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross ARCH=mips
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All warnings (new ones prefixed by >>):
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c: In function 'stmmac_xdp_xmit_zc':
>> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:2359:33: warning: passing argument 1 of 'txq_trans_cond_update' makes pointer from integer without a cast [-Wint-conversion]
2359 | txq_trans_cond_update(nq->trans_start);
| ~~^~~~~~~~~~~~~
| |
| long unsigned int
In file included from include/net/sock.h:46,
from include/linux/tcp.h:19,
from drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:21:
include/linux/netdevice.h:4107:63: note: expected 'struct netdev_queue *' but argument is of type 'long unsigned int'
4107 | static inline void txq_trans_cond_update(struct netdev_queue *txq)
| ~~~~~~~~~~~~~~~~~~~~~^~~
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c: In function 'stmmac_xdp_xmit_back':
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:4660:33: warning: passing argument 1 of 'txq_trans_cond_update' makes pointer from integer without a cast [-Wint-conversion]
4660 | txq_trans_cond_update(nq->trans_start);
| ~~^~~~~~~~~~~~~
| |
| long unsigned int
In file included from include/net/sock.h:46,
from include/linux/tcp.h:19,
from drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:21:
include/linux/netdevice.h:4107:63: note: expected 'struct netdev_queue *' but argument is of type 'long unsigned int'
4107 | static inline void txq_trans_cond_update(struct netdev_queue *txq)
| ~~~~~~~~~~~~~~~~~~~~~^~~
vim +/txq_trans_cond_update +2359 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
2347
2348 static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget)
2349 {
2350 struct netdev_queue *nq = netdev_get_tx_queue(priv->dev, queue);
2351 struct stmmac_tx_queue *tx_q = &priv->tx_queue[queue];
2352 struct xsk_buff_pool *pool = tx_q->xsk_pool;
2353 unsigned int entry = tx_q->cur_tx;
2354 struct dma_desc *tx_desc = NULL;
2355 struct xdp_desc xdp_desc;
2356 bool work_done = true;
2357
2358 /* Avoids TX time-out as we are sharing with slow path */
> 2359 txq_trans_cond_update(nq->trans_start);
2360
2361 budget = min(budget, stmmac_tx_avail(priv, queue));
2362
2363 while (budget-- > 0) {
2364 dma_addr_t dma_addr;
2365 bool set_ic;
2366
2367 /* We are sharing with slow path and stop XSK TX desc submission when
2368 * available TX ring is less than threshold.
2369 */
2370 if (unlikely(stmmac_tx_avail(priv, queue) < STMMAC_TX_XSK_AVAIL) ||
2371 !netif_carrier_ok(priv->dev)) {
2372 work_done = false;
2373 break;
2374 }
2375
2376 if (!xsk_tx_peek_desc(pool, &xdp_desc))
2377 break;
2378
2379 if (likely(priv->extend_desc))
2380 tx_desc = (struct dma_desc *)(tx_q->dma_etx + entry);
2381 else if (tx_q->tbs & STMMAC_TBS_AVAIL)
2382 tx_desc = &tx_q->dma_entx[entry].basic;
2383 else
2384 tx_desc = tx_q->dma_tx + entry;
2385
2386 dma_addr = xsk_buff_raw_get_dma(pool, xdp_desc.addr);
2387 xsk_buff_raw_dma_sync_for_device(pool, dma_addr, xdp_desc.len);
2388
2389 tx_q->tx_skbuff_dma[entry].buf_type = STMMAC_TXBUF_T_XSK_TX;
2390
2391 /* To return XDP buffer to XSK pool, we simple call
2392 * xsk_tx_completed(), so we don't need to fill up
2393 * 'buf' and 'xdpf'.
2394 */
2395 tx_q->tx_skbuff_dma[entry].buf = 0;
2396 tx_q->xdpf[entry] = NULL;
2397
2398 tx_q->tx_skbuff_dma[entry].map_as_page = false;
2399 tx_q->tx_skbuff_dma[entry].len = xdp_desc.len;
2400 tx_q->tx_skbuff_dma[entry].last_segment = true;
2401 tx_q->tx_skbuff_dma[entry].is_jumbo = false;
2402
2403 stmmac_set_desc_addr(priv, tx_desc, dma_addr);
2404
2405 tx_q->tx_count_frames++;
2406
2407 if (!priv->tx_coal_frames[queue])
2408 set_ic = false;
2409 else if (tx_q->tx_count_frames % priv->tx_coal_frames[queue] == 0)
2410 set_ic = true;
2411 else
2412 set_ic = false;
2413
2414 if (set_ic) {
2415 tx_q->tx_count_frames = 0;
2416 stmmac_set_tx_ic(priv, tx_desc);
2417 priv->xstats.tx_set_ic_bit++;
2418 }
2419
2420 stmmac_prepare_tx_desc(priv, tx_desc, 1, xdp_desc.len,
2421 true, priv->mode, true, true,
2422 xdp_desc.len);
2423
2424 stmmac_enable_dma_transmission(priv, priv->ioaddr);
2425
2426 tx_q->cur_tx = STMMAC_GET_ENTRY(tx_q->cur_tx, priv->dma_tx_size);
2427 entry = tx_q->cur_tx;
2428 }
2429
2430 if (tx_desc) {
2431 stmmac_flush_tx_descriptors(priv, queue);
2432 xsk_tx_release(pool);
2433 }
2434
2435 /* Return true if all of the 3 conditions are met
2436 * a) TX Budget is still available
2437 * b) work_done = true when XSK TX desc peek is empty (no more
2438 * pending XSK TX for transmission)
2439 */
2440 return !!budget && work_done;
2441 }
2442
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org
[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 72397 bytes --]
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2021-11-18 18:27 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-11-17 3:29 [PATCH net-next 0/4] net: make dev_watchdog() less intrusive Eric Dumazet
2021-11-17 3:29 ` [PATCH net-next 1/4] net: use an atomic_long_t for queue->trans_timeout Eric Dumazet
2021-11-17 3:29 ` [PATCH net-next 2/4] net: annotate accesses to queue->trans_start Eric Dumazet
2021-11-18 18:27 ` kernel test robot
2021-11-17 3:29 ` [PATCH net-next 3/4] net: do not inline netif_tx_lock()/netif_tx_unlock() Eric Dumazet
2021-11-17 3:29 ` [PATCH net-next 4/4] net: no longer stop all TX queues in dev_watchdog() Eric Dumazet
2021-11-17 15:00 ` [PATCH net-next 0/4] net: make dev_watchdog() less intrusive patchwork-bot+netdevbpf
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.