From: Petr Machata <petrm@nvidia.com>
To: Davide Caratti <dcaratti@redhat.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>,
Cong Wang <xiyou.wangcong@gmail.com>,
Jiri Pirko <jiri@resnulli.us>,
"David S. Miller" <davem@davemloft.net>,
"Eric Dumazet" <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>,
"Paolo Abeni" <pabeni@redhat.com>,
Simon Horman <horms@kernel.org>,
Lion Ackermann <nnamrec@gmail.com>,
Petr Machata <petrm@mellanox.com>, <netdev@vger.kernel.org>,
Ivan Vecera <ivecera@redhat.com>, Li Shuang <shuali@redhat.com>
Subject: Re: [PATCH net] net/sched: ets: use old 'nbands' while purging unused classes
Date: Fri, 8 Aug 2025 13:44:59 +0200 [thread overview]
Message-ID: <87ms8a9fuv.fsf@nvidia.com> (raw)
In-Reply-To: <f3b9bacc73145f265c19ab80785933da5b7cbdec.1754581577.git.dcaratti@redhat.com>
Davide Caratti <dcaratti@redhat.com> writes:
> Shuang reported sch_ets test-case [1] crashing in ets_class_qlen_notify()
> after recent changes from Lion [2]. The problem is: in ets_qdisc_change()
> we purge unused DWRR queues; the value of 'q->nbands' is the new one, and
> the cleanup should be done with the old one. The problem is here since my
> first attempts to fix ets_qdisc_change(), but it surfaced again after the
> recent qdisc len accounting fixes. Fix it purging idle DWRR queues before
> assigning a new value of 'q->nbands', so that all purge operations find a
> consistent configuration:
First let me just note: that is one sharp paragraph edge!
> - old 'q->nbands' because it's needed by ets_class_find()
Comes from qdisc_purge_queue() calls qdisc_tree_reduce_backlog() calls
Qdisc_class_ops.find, which is ets_class_find().
> - old 'q->nstrict' because it's needed by ets_class_is_strict()
From qdisc_purge_queue() calls qdisc_tree_reduce_backlog() calls
Qdisc_class_ops.qlen_notify, which is ets_class_qlen_notify() calls
ets_class_is_strict().
Also qdisc_purge_queue() calls qdisc_reset() calls Qdisc_ops.reset which
is ets_qdisc_reset(), and that accesses both fields.
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x0000) - not-present page
> PGD 0 P4D 0
> Oops: Oops: 0000 [#1] SMP NOPTI
> CPU: 62 UID: 0 PID: 39457 Comm: tc Kdump: loaded Not tainted 6.12.0-116.el10.x86_64 #1 PREEMPT(voluntary)
> Hardware name: Dell Inc. PowerEdge R640/06DKY5, BIOS 2.12.2 07/09/2021
> RIP: 0010:__list_del_entry_valid_or_report+0x4/0x80
> Code: ff 4c 39 c7 0f 84 39 19 8e ff b8 01 00 00 00 c3 cc cc cc cc 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa <48> 8b 17 48 8b 4f 08 48 85 d2 0f 84 56 19 8e ff 48 85 c9 0f 84 ab
> RSP: 0018:ffffba186009f400 EFLAGS: 00010202
> RAX: 00000000000000d6 RBX: 0000000000000000 RCX: 0000000000000004
> RDX: ffff9f0fa29b69c0 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: ffffffffc12c2400 R08: 0000000000000008 R09: 0000000000000004
> R10: ffffffffffffffff R11: 0000000000000004 R12: 0000000000000000
> R13: ffff9f0f8cfe0000 R14: 0000000000100005 R15: 0000000000000000
> FS: 00007f2154f37480(0000) GS:ffff9f269c1c0000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 00000001530be001 CR4: 00000000007726f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> PKRU: 55555554
> Call Trace:
> <TASK>
> ets_class_qlen_notify+0x65/0x90 [sch_ets]
> qdisc_tree_reduce_backlog+0x74/0x110
> ets_qdisc_change+0x630/0xa40 [sch_ets]
> __tc_modify_qdisc.constprop.0+0x216/0x7f0
> tc_modify_qdisc+0x7c/0x120
> rtnetlink_rcv_msg+0x145/0x3f0
> netlink_rcv_skb+0x53/0x100
> netlink_unicast+0x245/0x390
> netlink_sendmsg+0x21b/0x470
> ____sys_sendmsg+0x39d/0x3d0
> ___sys_sendmsg+0x9a/0xe0
> __sys_sendmsg+0x7a/0xd0
> do_syscall_64+0x7d/0x160
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
> RIP: 0033:0x7f2155114084
> Code: 89 02 b8 ff ff ff ff eb bb 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 80 3d 25 f0 0c 00 00 74 13 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 89 54 24 1c 48 89
> RSP: 002b:00007fff1fd7a988 EFLAGS: 00000202 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 0000560ec063e5e0 RCX: 00007f2155114084
> RDX: 0000000000000000 RSI: 00007fff1fd7a9f0 RDI: 0000000000000003
> RBP: 00007fff1fd7aa60 R08: 0000000000000010 R09: 000000000000003f
> R10: 0000560ee9b3a010 R11: 0000000000000202 R12: 00007fff1fd7aae0
> R13: 000000006891ccde R14: 0000560ec063e5e0 R15: 00007fff1fd7aad0
> </TASK>
>
> [1] https://lore.kernel.org/netdev/e08c7f4a6882f260011909a868311c6e9b54f3e4.1639153474.git.dcaratti@redhat.com/
> [2] https://lore.kernel.org/netdev/d912cbd7-193b-4269-9857-525bee8bbb6a@gmail.com/
>
> Fixes: 103406b38c60 ("net/sched: Always pass notifications when child class becomes empty")
> Fixes: c062f2a0b04d ("net/sched: sch_ets: don't remove idle classes from the round-robin list")
> Fixes: dcc68b4d8084 ("net: sch_ets: Add a new Qdisc")
> Reported-by: Li Shuang <shuali@redhat.com>
> Closes: https://issues.redhat.com/browse/RHEL-108026
> Co-developed-by: Ivan Vecera <ivecera@redhat.com>
> Signed-off-by: Ivan Vecera <ivecera@redhat.com>
> Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
> ---
> net/sched/sch_ets.c | 11 ++++++-----
> 1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/net/sched/sch_ets.c b/net/sched/sch_ets.c
> index 037f764822b9..82635dd2cfa5 100644
> --- a/net/sched/sch_ets.c
> +++ b/net/sched/sch_ets.c
> @@ -651,6 +651,12 @@ static int ets_qdisc_change(struct Qdisc *sch, struct nlattr *opt,
>
> sch_tree_lock(sch);
>
> + for (i = nbands; i < oldbands; i++) {
> + if (i >= q->nstrict && q->classes[i].qdisc->q.qlen)
> + list_del_init(&q->classes[i].alist);
> + qdisc_purge_queue(q->classes[i].qdisc);
> + }
> +
> WRITE_ONCE(q->nbands, nbands);
> for (i = nstrict; i < q->nstrict; i++) {
> if (q->classes[i].qdisc->q.qlen) {
> @@ -658,11 +664,6 @@ static int ets_qdisc_change(struct Qdisc *sch, struct nlattr *opt,
> q->classes[i].deficit = quanta[i];
> }
> }
> - for (i = q->nbands; i < oldbands; i++) {
> - if (i >= q->nstrict && q->classes[i].qdisc->q.qlen)
> - list_del_init(&q->classes[i].alist);
> - qdisc_purge_queue(q->classes[i].qdisc);
> - }
> WRITE_ONCE(q->nstrict, nstrict);
> memcpy(q->prio2band, priomap, sizeof(priomap));
next prev parent reply other threads:[~2025-08-08 12:08 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-07 15:48 [PATCH net] net/sched: ets: use old 'nbands' while purging unused classes Davide Caratti
2025-08-08 11:44 ` Petr Machata [this message]
2025-08-08 18:15 ` Victor Nogueira
2025-08-11 7:49 ` Davide Caratti
2025-08-11 9:53 ` Victor Nogueira
2025-08-11 13:52 ` Victor Nogueira
2025-08-11 16:09 ` Davide Caratti
2025-08-11 17:35 ` Victor Nogueira
2025-08-12 1:26 ` Jakub Kicinski
2025-08-12 11:21 ` Davide Caratti
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87ms8a9fuv.fsf@nvidia.com \
--to=petrm@nvidia.com \
--cc=davem@davemloft.net \
--cc=dcaratti@redhat.com \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=ivecera@redhat.com \
--cc=jhs@mojatatu.com \
--cc=jiri@resnulli.us \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=nnamrec@gmail.com \
--cc=pabeni@redhat.com \
--cc=petrm@mellanox.com \
--cc=shuali@redhat.com \
--cc=xiyou.wangcong@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.