From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, Martin Ottens <martin.ottens@fau.de>,
Jamal Hadi Salim <jhs@mojatatu.com>,
Jakub Kicinski <kuba@kernel.org>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 6.1 59/76] net/sched: netem: account for backlog updates from child qdisc
Date: Tue, 17 Dec 2024 18:07:39 +0100
Message-ID: <20241217170528.720423835@linuxfoundation.org>
In-Reply-To: <20241217170526.232803729@linuxfoundation.org>
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Martin Ottens <martin.ottens@fau.de>
[ Upstream commit f8d4bc455047cf3903cd6f85f49978987dbb3027 ]
In general, 'qlen' of any classful qdisc should keep track of the
number of packets that the qdisc itself and all of its children hold.
In case of netem, 'qlen' only accounts for the packets in its internal
tfifo. When netem is used with a child qdisc, the child qdisc can use
'qdisc_tree_reduce_backlog' to inform its parent, netem, about created
or dropped SKBs. This function updates 'qlen' and the backlog statistics
of netem, but netem does not account for changes made by a child qdisc.
'qlen' then indicates the wrong number of packets in the tfifo.
If a child qdisc creates new SKBs during enqueue and informs its parent
about this, netem's 'qlen' value is increased. When netem dequeues the
newly created SKBs from the child, the 'qlen' in netem is not updated.
If 'qlen' reaches the configured sch->limit, the enqueue function stops
working, even though the tfifo is not full.
Reproduce the bug:
Ensure that the sender machine has GSO enabled. Configure netem as root
qdisc and tbf as its child on the outgoing interface of the machine
as follows:

  $ tc qdisc add dev <oif> root handle 1: netem delay 100ms limit 100
  $ tc qdisc add dev <oif> parent 1:0 tbf rate 50Mbit burst 1542 latency 50ms

Send bulk TCP traffic out via this interface, e.g., by running an iPerf3
client on the machine. Check the qdisc statistics:

  $ tc -s qdisc show dev <oif>

Statistics after 10s of iPerf3 TCP test before the fix (note that
netem's backlog > limit, netem stopped accepting packets):

  qdisc netem 1: root refcnt 2 limit 1000 delay 100ms
   Sent 2767766 bytes 1848 pkt (dropped 652, overlimits 0 requeues 0)
   backlog 4294528236b 1155p requeues 0
  qdisc tbf 10: parent 1:1 rate 50Mbit burst 1537b lat 50ms
   Sent 2767766 bytes 1848 pkt (dropped 327, overlimits 7601 requeues 0)
   backlog 0b 0p requeues 0

Statistics after the fix:

  qdisc netem 1: root refcnt 2 limit 1000 delay 100ms
   Sent 37766372 bytes 24974 pkt (dropped 9, overlimits 0 requeues 0)
   backlog 0b 0p requeues 0
  qdisc tbf 10: parent 1:1 rate 50Mbit burst 1537b lat 50ms
   Sent 37766372 bytes 24974 pkt (dropped 327, overlimits 96017 requeues 0)
   backlog 0b 0p requeues 0
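A side note on the pre-fix numbers: the netem backlog of 4294528236b is just below 2^32. The backlog field in the qdisc queue statistics is a 32-bit unsigned counter, so this value would be consistent with the counter having wrapped below zero. That reading is an interpretation and is not stated in the commit message. The helper below is a hypothetical illustration (not kernel code) that only shows the arithmetic:

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the kernel: interpret a 32-bit
 * unsigned byte counter as a signed quantity, so that a value that
 * wrapped below zero shows up as a negative number.
 */
static int64_t backlog_as_signed(uint32_t backlog)
{
	if (backlog > INT32_MAX)
		return (int64_t)backlog - ((int64_t)1 << 32);
	return (int64_t)backlog;
}
```

For the value reported above, backlog_as_signed(4294528236u) evaluates to -439060, i.e. 2^32 - 439060 = 4294528236.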
In this setup, tbf segments the GSO SKBs (tbf_segment) and reports the
additional packets to netem, which increases netem's 'qlen'. Eventually
the interface stops transferring packets entirely and "locks": the child
qdisc and the tfifo are both empty, but 'qlen' indicates that the tfifo
is at its limit, so no more packets are accepted.
This patch adds a counter for the entries in the tfifo. Netem's 'qlen' is
only decreased when a packet is returned by its dequeue function, and not
during enqueuing into the child qdisc. External updates to 'qlen' are thus
accounted for and only the behavior of the backlog statistics changes. As
in other qdiscs, 'qlen' then keeps track of how many packets are held in
netem and all of its children. As before, sch->limit remains as the
maximum number of packets in the tfifo. The same applies to netem's
backlog statistics.
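The accounting change can be sketched with a toy user-space model. This is a minimal sketch under simplifying assumptions, not the kernel implementation: all names (toy_netem, toy_enqueue, toy_run, ...) are hypothetical, every packet is due immediately, the child always accepts the packet, and the child splits each packet into 'nsegs' segments and reports the 'nsegs - 1' extra packets to its parent, as a segmenting child would via qdisc_tree_reduce_backlog.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: illustrative names, not kernel APIs. */
struct toy_netem {
	uint32_t qlen;	/* models sch->q.qlen */
	uint32_t t_len;	/* models q->t_len (packets in the tfifo) */
	uint32_t limit;	/* models sch->limit */
	uint32_t child;	/* packets currently held by the child qdisc */
	bool fixed;	/* false: old accounting, true: patched accounting */
};

/* Models the limit check in netem_enqueue() followed by tfifo_enqueue(). */
static bool toy_enqueue(struct toy_netem *q)
{
	uint32_t fill = q->fixed ? q->t_len : q->qlen;

	if (fill >= q->limit)
		return false;	/* packet is dropped */
	q->t_len++;
	q->qlen++;
	return true;
}

/* Models moving one due packet from the tfifo into the child (nsegs >= 1). */
static void toy_tfifo_to_child(struct toy_netem *q, uint32_t nsegs)
{
	q->t_len--;
	if (!q->fixed)
		q->qlen--;	/* old: decrement when leaving the tfifo */
	q->child += nsegs;
	q->qlen += nsegs - 1;	/* child reports the extra segments */
}

/* Models dequeuing one packet from the child and delivering it. */
static bool toy_child_dequeue(struct toy_netem *q)
{
	if (q->child == 0)
		return false;
	q->child--;
	if (q->fixed)
		q->qlen--;	/* new: decrement when the packet leaves netem */
	return true;
}

/* Offer 'rounds' packets one at a time; return how many were accepted. */
static uint32_t toy_run(bool fixed, uint32_t limit, uint32_t rounds,
			uint32_t nsegs)
{
	struct toy_netem q = { .limit = limit, .fixed = fixed };
	uint32_t accepted = 0;

	for (uint32_t i = 0; i < rounds; i++) {
		if (!toy_enqueue(&q))
			continue;
		accepted++;
		toy_tfifo_to_child(&q, nsegs);
		while (toy_child_dequeue(&q))
			;
	}
	return accepted;
}
```

In this model, toy_run(false, 100, 1000, 2) accepts only 100 packets: each round leaks one unit of 'qlen' until the limit check fails permanently, although the tfifo and the child are empty. toy_run(true, 100, 1000, 2) accepts all 1000, because the limit check uses the tfifo counter and 'qlen' returns to zero once the child has been drained.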
Fixes: 50612537e9ab ("netem: fix classful handling")
Signed-off-by: Martin Ottens <martin.ottens@fau.de>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20241210131412.1837202-1-martin.ottens@fau.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/sched/sch_netem.c | 22 ++++++++++++++++------
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 0eba06613dcd..f47ab622399f 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -77,6 +77,8 @@ struct netem_sched_data {
 	struct sk_buff *t_head;
 	struct sk_buff *t_tail;

+	u32 t_len;
+
 	/* optional qdisc for classful handling (NULL at netem init) */
 	struct Qdisc *qdisc;

@@ -373,6 +375,7 @@ static void tfifo_reset(struct Qdisc *sch)
 	rtnl_kfree_skbs(q->t_head, q->t_tail);
 	q->t_head = NULL;
 	q->t_tail = NULL;
+	q->t_len = 0;
 }

 static void tfifo_enqueue(struct sk_buff *nskb, struct Qdisc *sch)
@@ -402,6 +405,7 @@ static void tfifo_enqueue(struct sk_buff *nskb, struct Qdisc *sch)
 		rb_link_node(&nskb->rbnode, parent, p);
 		rb_insert_color(&nskb->rbnode, &q->t_root);
 	}
+	q->t_len++;
 	sch->q.qlen++;
 }

@@ -508,7 +512,7 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch,
 			1<<prandom_u32_max(8);
 	}

-	if (unlikely(sch->q.qlen >= sch->limit)) {
+	if (unlikely(q->t_len >= sch->limit)) {
 		/* re-link segs, so that qdisc_drop_all() frees them all */
 		skb->next = segs;
 		qdisc_drop_all(skb, sch, to_free);
@@ -692,8 +696,8 @@ static struct sk_buff *netem_dequeue(struct Qdisc *sch)
 tfifo_dequeue:
 	skb = __qdisc_dequeue_head(&sch->q);
 	if (skb) {
-		qdisc_qstats_backlog_dec(sch, skb);
 deliver:
+		qdisc_qstats_backlog_dec(sch, skb);
 		qdisc_bstats_update(sch, skb);
 		return skb;
 	}
@@ -709,8 +713,7 @@ static struct sk_buff *netem_dequeue(struct Qdisc *sch)

 		if (time_to_send <= now && q->slot.slot_next <= now) {
 			netem_erase_head(q, skb);
-			sch->q.qlen--;
-			qdisc_qstats_backlog_dec(sch, skb);
+			q->t_len--;
 			skb->next = NULL;
 			skb->prev = NULL;
 			/* skb->dev shares skb->rbnode area,
@@ -737,16 +740,21 @@ static struct sk_buff *netem_dequeue(struct Qdisc *sch)
 					if (net_xmit_drop_count(err))
 						qdisc_qstats_drop(sch);
 					qdisc_tree_reduce_backlog(sch, 1, pkt_len);
+					sch->qstats.backlog -= pkt_len;
+					sch->q.qlen--;
 				}
 				goto tfifo_dequeue;
 			}
+			sch->q.qlen--;
 			goto deliver;
 		}

 		if (q->qdisc) {
 			skb = q->qdisc->ops->dequeue(q->qdisc);
-			if (skb)
+			if (skb) {
+				sch->q.qlen--;
 				goto deliver;
+			}
 		}

 		qdisc_watchdog_schedule_ns(&q->watchdog,
@@ -756,8 +764,10 @@ static struct sk_buff *netem_dequeue(struct Qdisc *sch)

 	if (q->qdisc) {
 		skb = q->qdisc->ops->dequeue(q->qdisc);
-		if (skb)
+		if (skb) {
+			sch->q.qlen--;
 			goto deliver;
+		}
 	}
 	return NULL;
 }
--
2.39.5