netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH net-next v2] net/sched: Introduce qdisc quirk_chk op
@ 2025-11-25 22:46 Victor Nogueira
  2025-11-26  3:11 ` Stephen Hemminger
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Victor Nogueira @ 2025-11-25 22:46 UTC (permalink / raw)
  To: davem, kuba, edumazet, pabeni, jhs, jiri, xiyou.wangcong, horms,
	stephen
  Cc: netdev

There is a pattern of bugs that end up creating UAFs or null ptr derefs.
The majority of these bugs follow the formula below:
a) create a nonsense hierarchy of qdiscs which has no practical value,
b) start sending packets
Optional c) netlink cmds to change hierarchy some more; It's more fun if
you can get packets stuck - the formula in this case includes non
work-conserving qdiscs somewhere in the hierarchy
Optional d dependent on c) send more packets
e) profit

Current init/change qdisc APIs are localised to validate only within the
constraint of a single qdisc. So catching #a or #c is a challenge. Our
policy, when said bugs are presented, is to "make it work" by modifying
generally used data structures and code, but these come at the expense of
adding special checks for corner cases which are nonsensical to begin with.

The goal of this patchset is to create an equivalent to PCI quirks, which
will catch nonsensical hierarchies in #a and #c and reject such a config.

With that in mind, we are proposing the addition of a new qdisc op
(quirk_chk). We introduce, as a first example, the quirk_chk op to netem.
Its purpose here is to validate whether the user is attempting to add 2
netem duplicates in the same qdisc tree which will be forbidden unless
the root qdisc is multiqueue.

Here is an example that should now work:

DEV="eth0"
NUM_QUEUES=4
DUPLICATE_PERCENT="5%"

tc qdisc del dev $DEV root > /dev/null 2>&1
tc qdisc add dev $DEV root handle 1: mq

for i in $(seq 1 $NUM_QUEUES); do
    HANDLE_ID=$((i * 10))
    PARENT_ID="1:$i"
    tc qdisc add dev $DEV parent $PARENT_ID handle \
        ${HANDLE_ID}: netem duplicate $DUPLICATE_PERCENT
done

Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
---
v1 -> v2:
- Simplify process of getting root qdisc in netem_quirk_chk
- Use parent's major directly instead of looking up parent qdisc in
  netem_quirk_chk
- Call parse_attrs in netem_quirk_chk to avoid issue caught by syzbot

Link to v1:
https://lore.kernel.org/netdev/20251124223749.503979-1-victor@mojatatu.com/
---
 include/net/sch_generic.h |  3 +++
 net/sched/sch_api.c       | 12 ++++++++++++
 net/sched/sch_netem.c     | 40 +++++++++++++++++++++++++++------------
 3 files changed, 43 insertions(+), 12 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 94966692ccdf..60450372c5d5 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -313,6 +313,9 @@ struct Qdisc_ops {
 						     u32 block_index);
 	void			(*egress_block_set)(struct Qdisc *sch,
 						    u32 block_index);
+	int			(*quirk_chk)(struct Qdisc *sch,
+					     struct nlattr *arg,
+					     struct netlink_ext_ack *extack);
 	u32			(*ingress_block_get)(struct Qdisc *sch);
 	u32			(*egress_block_get)(struct Qdisc *sch);
 
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index f56b18c8aebf..a850df437691 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1315,6 +1315,12 @@ static struct Qdisc *qdisc_create(struct net_device *dev,
 		rcu_assign_pointer(sch->stab, stab);
 	}
 
+	if (ops->quirk_chk) {
+		err = ops->quirk_chk(sch, tca[TCA_OPTIONS], extack);
+		if (err != 0)
+			goto err_out3;
+	}
+
 	if (ops->init) {
 		err = ops->init(sch, tca[TCA_OPTIONS], extack);
 		if (err != 0)
@@ -1378,6 +1384,12 @@ static int qdisc_change(struct Qdisc *sch, struct nlattr **tca,
 			NL_SET_ERR_MSG(extack, "Change of blocks is not supported");
 			return -EOPNOTSUPP;
 		}
+		if (sch->ops->quirk_chk) {
+			err = sch->ops->quirk_chk(sch, tca[TCA_OPTIONS],
+						  extack);
+			if (err != 0)
+				return err;
+		}
 		err = sch->ops->change(sch, tca[TCA_OPTIONS], extack);
 		if (err)
 			return err;
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index eafc316ae319..ceece2ae37bc 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -975,13 +975,27 @@ static int parse_attr(struct nlattr *tb[], int maxtype, struct nlattr *nla,
 
 static const struct Qdisc_class_ops netem_class_ops;
 
-static int check_netem_in_tree(struct Qdisc *sch, bool duplicates,
-			       struct netlink_ext_ack *extack)
+static int netem_quirk_chk(struct Qdisc *sch, struct nlattr *opt,
+			   struct netlink_ext_ack *extack)
 {
+	struct nlattr *tb[TCA_NETEM_MAX + 1];
+	struct tc_netem_qopt *qopt;
 	struct Qdisc *root, *q;
+	struct net_device *dev;
+	bool root_is_mq;
+	bool duplicates;
 	unsigned int i;
+	int ret;
+
+	ret = parse_attr(tb, TCA_NETEM_MAX, opt, netem_policy, sizeof(*qopt));
+	if (ret < 0)
+		return ret;
 
-	root = qdisc_root_sleeping(sch);
+	qopt = nla_data(opt);
+	duplicates = qopt->duplicate;
+
+	dev = sch->dev_queue->dev;
+	root = rtnl_dereference(dev->qdisc);
 
 	if (sch != root && root->ops->cl_ops == &netem_class_ops) {
 		if (duplicates ||
@@ -992,19 +1006,25 @@ static int check_netem_in_tree(struct Qdisc *sch, bool duplicates,
 	if (!qdisc_dev(root))
 		return 0;
 
+	root_is_mq = root->flags & TCQ_F_MQROOT;
+
 	hash_for_each(qdisc_dev(root)->qdisc_hash, i, q, hash) {
 		if (sch != q && q->ops->cl_ops == &netem_class_ops) {
 			if (duplicates ||
-			    ((struct netem_sched_data *)qdisc_priv(q))->duplicate)
-				goto err;
+			    ((struct netem_sched_data *)qdisc_priv(q))->duplicate) {
+				if (!root_is_mq ||
+				    TC_H_MAJ(q->parent) != root->handle ||
+				    TC_H_MAJ(q->parent) != TC_H_MAJ(sch->parent))
+					goto err;
+			}
 		}
 	}
 
 	return 0;
 
 err:
-	NL_SET_ERR_MSG(extack,
-		       "netem: cannot mix duplicating netems with other netems in tree");
+	NL_SET_ERR_MSG_MOD(extack,
+			   "cannot mix duplicating netems with other netems in tree unless root is multiqueue");
 	return -EINVAL;
 }
 
@@ -1066,11 +1086,6 @@ static int netem_change(struct Qdisc *sch, struct nlattr *opt,
 	q->gap = qopt->gap;
 	q->counter = 0;
 	q->loss = qopt->loss;
-
-	ret = check_netem_in_tree(sch, qopt->duplicate, extack);
-	if (ret)
-		goto unlock;
-
 	q->duplicate = qopt->duplicate;
 
 	/* for compatibility with earlier versions.
@@ -1352,6 +1367,7 @@ static struct Qdisc_ops netem_qdisc_ops __read_mostly = {
 	.destroy	=	netem_destroy,
 	.change		=	netem_change,
 	.dump		=	netem_dump,
+	.quirk_chk	=	netem_quirk_chk,
 	.owner		=	THIS_MODULE,
 };
 MODULE_ALIAS_NET_SCH("netem");
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH net-next v2] net/sched: Introduce qdisc quirk_chk op
  2025-11-25 22:46 [RFC PATCH net-next v2] net/sched: Introduce qdisc quirk_chk op Victor Nogueira
@ 2025-11-26  3:11 ` Stephen Hemminger
  2025-11-26 16:29   ` Jamal Hadi Salim
  2025-11-26  4:02 ` Cong Wang
  2025-11-26  8:16 ` [syzbot ci] " syzbot ci
  2 siblings, 1 reply; 6+ messages in thread
From: Stephen Hemminger @ 2025-11-26  3:11 UTC (permalink / raw)
  To: Victor Nogueira
  Cc: davem, kuba, edumazet, pabeni, jhs, jiri, xiyou.wangcong, horms,
	netdev

On Tue, 25 Nov 2025 19:46:04 -0300
Victor Nogueira <victor@mojatatu.com> wrote:

> There is a pattern of bugs that end up creating UAFs or null ptr derefs.
> The majority of these bugs follow the formula below:
> a) create a nonsense hierarchy of qdiscs which has no practical value,
> b) start sending packets
> Optional c) netlink cmds to change hierarchy some more; It's more fun if
> you can get packets stuck - the formula in this case includes non
> work-conserving qdiscs somewhere in the hierarchy
> Optional d dependent on c) send more packets
> e) profit
> 
> Current init/change qdisc APIs are localised to validate only within the
> constraint of a single qdisc. So catching #a or #c is a challenge. Our
> policy, when said bugs are presented, is to "make it work" by modifying
> generally used data structures and code, but these come at the expense of
> adding special checks for corner cases which are nonsensical to begin with.
> 
> The goal of this patchset is to create an equivalent to PCI quirks, which
> will catch nonsensical hierarchies in #a and #c and reject such a config.
> 
> With that in mind, we are proposing the addition of a new qdisc op
> (quirk_chk). We introduce, as a first example, the quirk_chk op to netem.
> Its purpose here is to validate whether the user is attempting to add 2
> netem duplicates in the same qdisc tree which will be forbidden unless
> the root qdisc is multiqueue.
> 
> Here is an example that should now work:
> 
> DEV="eth0"
> NUM_QUEUES=4
> DUPLICATE_PERCENT="5%"
> 
> tc qdisc del dev $DEV root > /dev/null 2>&1
> tc qdisc add dev $DEV root handle 1: mq
> 
> for i in $(seq 1 $NUM_QUEUES); do
>     HANDLE_ID=$((i * 10))
>     PARENT_ID="1:$i"
>     tc qdisc add dev $DEV parent $PARENT_ID handle \
>         ${HANDLE_ID}: netem duplicate $DUPLICATE_PERCENT
> done
> 
> Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
> Signed-off-by: Victor Nogueira <victor@mojatatu.com>
> ---
> v1 -> v2:
> - Simplify process of getting root qdisc in netem_quirk_chk
> - Use parent's major directly instead of looking up parent qdisc in
>   netem_quirk_chk
> - Call parse_attrs in netem_quirk_chk to avoid issue caught by syzbot
> 
> Link to v1:
> https://lore.kernel.org/netdev/20251124223749.503979-1-victor@mojatatu.com/
> ---
>  include/net/sch_generic.h |  3 +++
>  net/sched/sch_api.c       | 12 ++++++++++++
>  net/sched/sch_netem.c     | 40 +++++++++++++++++++++++++++------------
>  3 files changed, 43 insertions(+), 12 deletions(-)
> 
> diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
> index 94966692ccdf..60450372c5d5 100644
> --- a/include/net/sch_generic.h
> +++ b/include/net/sch_generic.h
> @@ -313,6 +313,9 @@ struct Qdisc_ops {
>  						     u32 block_index);
>  	void			(*egress_block_set)(struct Qdisc *sch,
>  						    u32 block_index);
> +	int			(*quirk_chk)(struct Qdisc *sch,
> +					     struct nlattr *arg,
> +					     struct netlink_ext_ack *extack);
>  	u32			(*ingress_block_get)(struct Qdisc *sch);
>  	u32			(*egress_block_get)(struct Qdisc *sch);
>  
> diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
> index f56b18c8aebf..a850df437691 100644
> --- a/net/sched/sch_api.c
> +++ b/net/sched/sch_api.c
> @@ -1315,6 +1315,12 @@ static struct Qdisc *qdisc_create(struct net_device *dev,
>  		rcu_assign_pointer(sch->stab, stab);
>  	}
>  
> +	if (ops->quirk_chk) {
> +		err = ops->quirk_chk(sch, tca[TCA_OPTIONS], extack);
> +		if (err != 0)
> +			goto err_out3;
> +	}
> +
>  	if (ops->init) {
>  		err = ops->init(sch, tca[TCA_OPTIONS], extack);
>  		if (err != 0)
> @@ -1378,6 +1384,12 @@ static int qdisc_change(struct Qdisc *sch, struct nlattr **tca,
>  			NL_SET_ERR_MSG(extack, "Change of blocks is not supported");
>  			return -EOPNOTSUPP;
>  		}
> +		if (sch->ops->quirk_chk) {
> +			err = sch->ops->quirk_chk(sch, tca[TCA_OPTIONS],
> +						  extack);
> +			if (err != 0)
> +				return err;
> +		}
>  		err = sch->ops->change(sch, tca[TCA_OPTIONS], extack);
>  		if (err)
>  			return err;
> diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
> index eafc316ae319..ceece2ae37bc 100644
> --- a/net/sched/sch_netem.c
> +++ b/net/sched/sch_netem.c
> @@ -975,13 +975,27 @@ static int parse_attr(struct nlattr *tb[], int maxtype, struct nlattr *nla,
>  
>  static const struct Qdisc_class_ops netem_class_ops;
>  
> -static int check_netem_in_tree(struct Qdisc *sch, bool duplicates,
> -			       struct netlink_ext_ack *extack)
> +static int netem_quirk_chk(struct Qdisc *sch, struct nlattr *opt,
> +			   struct netlink_ext_ack *extack)
>  {
> +	struct nlattr *tb[TCA_NETEM_MAX + 1];
> +	struct tc_netem_qopt *qopt;
>  	struct Qdisc *root, *q;
> +	struct net_device *dev;
> +	bool root_is_mq;
> +	bool duplicates;
>  	unsigned int i;
> +	int ret;
> +
> +	ret = parse_attr(tb, TCA_NETEM_MAX, opt, netem_policy, sizeof(*qopt));
> +	if (ret < 0)
> +		return ret;
>  
> -	root = qdisc_root_sleeping(sch);
> +	qopt = nla_data(opt);
> +	duplicates = qopt->duplicate;
> +
> +	dev = sch->dev_queue->dev;
> +	root = rtnl_dereference(dev->qdisc);
>  
>  	if (sch != root && root->ops->cl_ops == &netem_class_ops) {
>  		if (duplicates ||
> @@ -992,19 +1006,25 @@ static int check_netem_in_tree(struct Qdisc *sch, bool duplicates,
>  	if (!qdisc_dev(root))
>  		return 0;
>  
> +	root_is_mq = root->flags & TCQ_F_MQROOT;
> +

What about HTB or other inherently multi-q qdisc?
Using netem on HTB on some branches is common practice.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH net-next v2] net/sched: Introduce qdisc quirk_chk op
  2025-11-25 22:46 [RFC PATCH net-next v2] net/sched: Introduce qdisc quirk_chk op Victor Nogueira
  2025-11-26  3:11 ` Stephen Hemminger
@ 2025-11-26  4:02 ` Cong Wang
  2025-11-26  8:16 ` [syzbot ci] " syzbot ci
  2 siblings, 0 replies; 6+ messages in thread
From: Cong Wang @ 2025-11-26  4:02 UTC (permalink / raw)
  To: Victor Nogueira
  Cc: davem, kuba, edumazet, pabeni, jhs, jiri, horms, stephen, netdev

On Tue, Nov 25, 2025 at 07:46:04PM -0300, Victor Nogueira wrote:
> The goal of this patchset is to create an equivalent to PCI quirks, which
> will catch nonsensical hierarchies in #a and #c and reject such a config.

Sorry, we can't use another ugly code to justify adding more ugly code.

This is not anything personal, all hard-coded rules in kernel like this
are equally ugly. We should just revert the offending and ugly code,
instead of adding more ugly code to fix it.

BTW, I don't think this could fix it completely, it is just impossible
to know how many valid combinations this breaks.

Regards,
Cong

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [syzbot ci] Re: net/sched: Introduce qdisc quirk_chk op
  2025-11-25 22:46 [RFC PATCH net-next v2] net/sched: Introduce qdisc quirk_chk op Victor Nogueira
  2025-11-26  3:11 ` Stephen Hemminger
  2025-11-26  4:02 ` Cong Wang
@ 2025-11-26  8:16 ` syzbot ci
  2 siblings, 0 replies; 6+ messages in thread
From: syzbot ci @ 2025-11-26  8:16 UTC (permalink / raw)
  To: davem, edumazet, horms, jhs, jiri, kuba, netdev, pabeni, stephen,
	victor, xiyou.wangcong
  Cc: syzbot, syzkaller-bugs

syzbot ci has tested the following series

[v2] net/sched: Introduce qdisc quirk_chk op
https://lore.kernel.org/all/20251125224604.872351-1-victor@mojatatu.com
* [RFC PATCH net-next v2] net/sched: Introduce qdisc quirk_chk op

and found the following issue:
general protection fault in netem_quirk_chk

Full report is available here:
https://ci.syzbot.org/series/aa7207e5-00c6-4c9d-b330-9b2041104daf

***

general protection fault in netem_quirk_chk

tree:      net-next
URL:       https://kernel.googlesource.com/pub/scm/linux/kernel/git/netdev/net-next.git
base:      e2c20036a8879476c88002730d8a27f4e3c32d4b
arch:      amd64
compiler:  Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
config:    https://ci.syzbot.org/builds/bbda2e20-2099-41bd-a7d4-c298c61ed2f5/config
C repro:   https://ci.syzbot.org/findings/21f8d5f6-c5af-4e81-a8d5-56f5febd6f03/c_repro
syz repro: https://ci.syzbot.org/findings/21f8d5f6-c5af-4e81-a8d5-56f5febd6f03/syz_repro

netlink: 28 bytes leftover after parsing attributes in process `syz.0.17'.
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 1 UID: 0 PID: 5961 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:nla_len include/net/netlink.h:1296 [inline]
RIP: 0010:parse_attr net/sched/sch_netem.c:960 [inline]
RIP: 0010:netem_quirk_chk+0x8a/0x740 net/sched/sch_netem.c:990
Code: 7c 24 60 49 c1 ef 03 43 c7 04 27 f1 f1 f1 f1 43 c7 44 27 13 f3 f3 f3 f3 43 c6 44 27 17 f3 e8 8d 68 67 f8 48 89 d8 48 c1 e8 03 <42> 0f b6 04 20 84 c0 0f 85 32 06 00 00 0f b7 03 83 c0 fc 44 0f b7
RSP: 0018:ffffc900032c7140 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88810ff73a00
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8881b82ce000
RBP: ffffc900032c72b8 R08: ffff88810ff73a00 R09: 0000000000000002
R10: 00000000fffffff1 R11: ffffffff89589ab0 R12: dffffc0000000000
R13: ffffffff89589ab0 R14: ffffffff8f7d9580 R15: 1ffff92000658e34
FS:  000055558efa7500(0000) GS:ffff8882a9f35000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000200000000100 CR3: 0000000114760000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 qdisc_create+0x73f/0xf10 net/sched/sch_api.c:1319
 __tc_modify_qdisc net/sched/sch_api.c:1765 [inline]
 tc_modify_qdisc+0x1582/0x2140 net/sched/sch_api.c:1829
 rtnetlink_rcv_msg+0x77c/0xb70 net/core/rtnetlink.c:6967
 netlink_rcv_skb+0x208/0x470 net/netlink/af_netlink.c:2550
 netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
 netlink_unicast+0x82f/0x9e0 net/netlink/af_netlink.c:1344
 netlink_sendmsg+0x805/0xb30 net/netlink/af_netlink.c:1894
 sock_sendmsg_nosec net/socket.c:727 [inline]
 __sock_sendmsg+0x21c/0x270 net/socket.c:742
 ____sys_sendmsg+0x505/0x830 net/socket.c:2630
 ___sys_sendmsg+0x21f/0x2a0 net/socket.c:2684
 __sys_sendmsg net/socket.c:2716 [inline]
 __do_sys_sendmsg net/socket.c:2721 [inline]
 __se_sys_sendmsg net/socket.c:2719 [inline]
 __x64_sys_sendmsg+0x19b/0x260 net/socket.c:2719
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f1beb98f749
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffd1afcc778 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f1bebbe5fa0 RCX: 00007f1beb98f749
RDX: 0000000000000000 RSI: 0000200000000000 RDI: 0000000000000003
RBP: 00007f1beba13f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f1bebbe5fa0 R14: 00007f1bebbe5fa0 R15: 0000000000000003
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:nla_len include/net/netlink.h:1296 [inline]
RIP: 0010:parse_attr net/sched/sch_netem.c:960 [inline]
RIP: 0010:netem_quirk_chk+0x8a/0x740 net/sched/sch_netem.c:990
Code: 7c 24 60 49 c1 ef 03 43 c7 04 27 f1 f1 f1 f1 43 c7 44 27 13 f3 f3 f3 f3 43 c6 44 27 17 f3 e8 8d 68 67 f8 48 89 d8 48 c1 e8 03 <42> 0f b6 04 20 84 c0 0f 85 32 06 00 00 0f b7 03 83 c0 fc 44 0f b7
RSP: 0018:ffffc900032c7140 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88810ff73a00
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8881b82ce000
RBP: ffffc900032c72b8 R08: ffff88810ff73a00 R09: 0000000000000002
R10: 00000000fffffff1 R11: ffffffff89589ab0 R12: dffffc0000000000
R13: ffffffff89589ab0 R14: ffffffff8f7d9580 R15: 1ffff92000658e34
FS:  000055558efa7500(0000) GS:ffff8882a9f35000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000200000000100 CR3: 0000000114760000 CR4: 00000000000006f0
----------------
Code disassembly (best guess), 1 bytes skipped:
   0:	24 60                	and    $0x60,%al
   2:	49 c1 ef 03          	shr    $0x3,%r15
   6:	43 c7 04 27 f1 f1 f1 	movl   $0xf1f1f1f1,(%r15,%r12,1)
   d:	f1
   e:	43 c7 44 27 13 f3 f3 	movl   $0xf3f3f3f3,0x13(%r15,%r12,1)
  15:	f3 f3
  17:	43 c6 44 27 17 f3    	movb   $0xf3,0x17(%r15,%r12,1)
  1d:	e8 8d 68 67 f8       	call   0xf86768af
  22:	48 89 d8             	mov    %rbx,%rax
  25:	48 c1 e8 03          	shr    $0x3,%rax
* 29:	42 0f b6 04 20       	movzbl (%rax,%r12,1),%eax <-- trapping instruction
  2e:	84 c0                	test   %al,%al
  30:	0f 85 32 06 00 00    	jne    0x668
  36:	0f b7 03             	movzwl (%rbx),%eax
  39:	83 c0 fc             	add    $0xfffffffc,%eax
  3c:	44                   	rex.R
  3d:	0f                   	.byte 0xf
  3e:	b7                   	.byte 0xb7


***

If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
  Tested-by: syzbot@syzkaller.appspotmail.com

---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH net-next v2] net/sched: Introduce qdisc quirk_chk op
  2025-11-26  3:11 ` Stephen Hemminger
@ 2025-11-26 16:29   ` Jamal Hadi Salim
  2025-11-26 20:31     ` Victor Nogueira
  0 siblings, 1 reply; 6+ messages in thread
From: Jamal Hadi Salim @ 2025-11-26 16:29 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Victor Nogueira, davem, kuba, edumazet, pabeni, jiri,
	xiyou.wangcong, horms, netdev

On Tue, Nov 25, 2025 at 10:11 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Tue, 25 Nov 2025 19:46:04 -0300
> Victor Nogueira <victor@mojatatu.com> wrote:
>
> > There is a pattern of bugs that end up creating UAFs or null ptr derefs.
> > The majority of these bugs follow the formula below:
> > a) create a nonsense hierarchy of qdiscs which has no practical value,
> > b) start sending packets
> > Optional c) netlink cmds to change hierarchy some more; It's more fun if
> > you can get packets stuck - the formula in this case includes non
> > work-conserving qdiscs somewhere in the hierarchy
> > Optional d dependent on c) send more packets
> > e) profit
> >
> > Current init/change qdisc APIs are localised to validate only within the
> > constraint of a single qdisc. So catching #a or #c is a challenge. Our
> > policy, when said bugs are presented, is to "make it work" by modifying
> > generally used data structures and code, but these come at the expense of
> > adding special checks for corner cases which are nonsensical to begin with.
> >
> > The goal of this patchset is to create an equivalent to PCI quirks, which
> > will catch nonsensical hierarchies in #a and #c and reject such a config.
> >
> > With that in mind, we are proposing the addition of a new qdisc op
> > (quirk_chk). We introduce, as a first example, the quirk_chk op to netem.
> > Its purpose here is to validate whether the user is attempting to add 2
> > netem duplicates in the same qdisc tree which will be forbidden unless
> > the root qdisc is multiqueue.
> >
> > Here is an example that should now work:
> >
> > DEV="eth0"
> > NUM_QUEUES=4
> > DUPLICATE_PERCENT="5%"
> >
> > tc qdisc del dev $DEV root > /dev/null 2>&1
> > tc qdisc add dev $DEV root handle 1: mq
> >
> > for i in $(seq 1 $NUM_QUEUES); do
> >     HANDLE_ID=$((i * 10))
> >     PARENT_ID="1:$i"
> >     tc qdisc add dev $DEV parent $PARENT_ID handle \
> >         ${HANDLE_ID}: netem duplicate $DUPLICATE_PERCENT
> > done
> >
> > Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
> > Signed-off-by: Victor Nogueira <victor@mojatatu.com>
> > ---
> > v1 -> v2:
> > - Simplify process of getting root qdisc in netem_quirk_chk
> > - Use parent's major directly instead of looking up parent qdisc in
> >   netem_quirk_chk
> > - Call parse_attrs in netem_quirk_chk to avoid issue caught by syzbot
> >
> > Link to v1:
> > https://lore.kernel.org/netdev/20251124223749.503979-1-victor@mojatatu.com/
> > ---
> >  include/net/sch_generic.h |  3 +++
> >  net/sched/sch_api.c       | 12 ++++++++++++
> >  net/sched/sch_netem.c     | 40 +++++++++++++++++++++++++++------------
> >  3 files changed, 43 insertions(+), 12 deletions(-)
> >
> > diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
> > index 94966692ccdf..60450372c5d5 100644
> > --- a/include/net/sch_generic.h
> > +++ b/include/net/sch_generic.h
> > @@ -313,6 +313,9 @@ struct Qdisc_ops {
> >                                                    u32 block_index);
> >       void                    (*egress_block_set)(struct Qdisc *sch,
> >                                                   u32 block_index);
> > +     int                     (*quirk_chk)(struct Qdisc *sch,
> > +                                          struct nlattr *arg,
> > +                                          struct netlink_ext_ack *extack);
> >       u32                     (*ingress_block_get)(struct Qdisc *sch);
> >       u32                     (*egress_block_get)(struct Qdisc *sch);
> >
> > diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
> > index f56b18c8aebf..a850df437691 100644
> > --- a/net/sched/sch_api.c
> > +++ b/net/sched/sch_api.c
> > @@ -1315,6 +1315,12 @@ static struct Qdisc *qdisc_create(struct net_device *dev,
> >               rcu_assign_pointer(sch->stab, stab);
> >       }
> >
> > +     if (ops->quirk_chk) {
> > +             err = ops->quirk_chk(sch, tca[TCA_OPTIONS], extack);
> > +             if (err != 0)
> > +                     goto err_out3;
> > +     }
> > +
> >       if (ops->init) {
> >               err = ops->init(sch, tca[TCA_OPTIONS], extack);
> >               if (err != 0)
> > @@ -1378,6 +1384,12 @@ static int qdisc_change(struct Qdisc *sch, struct nlattr **tca,
> >                       NL_SET_ERR_MSG(extack, "Change of blocks is not supported");
> >                       return -EOPNOTSUPP;
> >               }
> > +             if (sch->ops->quirk_chk) {
> > +                     err = sch->ops->quirk_chk(sch, tca[TCA_OPTIONS],
> > +                                               extack);
> > +                     if (err != 0)
> > +                             return err;
> > +             }
> >               err = sch->ops->change(sch, tca[TCA_OPTIONS], extack);
> >               if (err)
> >                       return err;
> > diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
> > index eafc316ae319..ceece2ae37bc 100644
> > --- a/net/sched/sch_netem.c
> > +++ b/net/sched/sch_netem.c
> > @@ -975,13 +975,27 @@ static int parse_attr(struct nlattr *tb[], int maxtype, struct nlattr *nla,
> >
> >  static const struct Qdisc_class_ops netem_class_ops;
> >
> > -static int check_netem_in_tree(struct Qdisc *sch, bool duplicates,
> > -                            struct netlink_ext_ack *extack)
> > +static int netem_quirk_chk(struct Qdisc *sch, struct nlattr *opt,
> > +                        struct netlink_ext_ack *extack)
> >  {
> > +     struct nlattr *tb[TCA_NETEM_MAX + 1];
> > +     struct tc_netem_qopt *qopt;
> >       struct Qdisc *root, *q;
> > +     struct net_device *dev;
> > +     bool root_is_mq;
> > +     bool duplicates;
> >       unsigned int i;
> > +     int ret;
> > +
> > +     ret = parse_attr(tb, TCA_NETEM_MAX, opt, netem_policy, sizeof(*qopt));
> > +     if (ret < 0)
> > +             return ret;
> >
> > -     root = qdisc_root_sleeping(sch);
> > +     qopt = nla_data(opt);
> > +     duplicates = qopt->duplicate;
> > +
> > +     dev = sch->dev_queue->dev;
> > +     root = rtnl_dereference(dev->qdisc);
> >
> >       if (sch != root && root->ops->cl_ops == &netem_class_ops) {
> >               if (duplicates ||
> > @@ -992,19 +1006,25 @@ static int check_netem_in_tree(struct Qdisc *sch, bool duplicates,
> >       if (!qdisc_dev(root))
> >               return 0;
> >
> > +     root_is_mq = root->flags & TCQ_F_MQROOT;
> > +
>
> What about HTB or other inherently multi-q qdisc?
> Using netem on HTB on some branches is common practice.

Agreed - this should cover all classful qdiscs. I think the check
comes from the earlier discussion which involves offloadable qdiscs
that have multiple children (where HTB fits as well).
@Victor Nogueira how about check for nested netems with duplicates?

cheers,
jamal

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH net-next v2] net/sched: Introduce qdisc quirk_chk op
  2025-11-26 16:29   ` Jamal Hadi Salim
@ 2025-11-26 20:31     ` Victor Nogueira
  0 siblings, 0 replies; 6+ messages in thread
From: Victor Nogueira @ 2025-11-26 20:31 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Stephen Hemminger, davem, kuba, edumazet, pabeni, jiri,
	xiyou.wangcong, horms, netdev

On Wed, Nov 26, 2025 at 1:29 PM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
>
> On Tue, Nov 25, 2025 at 10:11 PM Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> > > [...]
> > > -     root = qdisc_root_sleeping(sch);
> > > +     qopt = nla_data(opt);
> > > +     duplicates = qopt->duplicate;
> > > +
> > > +     dev = sch->dev_queue->dev;
> > > +     root = rtnl_dereference(dev->qdisc);
> > >
> > >       if (sch != root && root->ops->cl_ops == &netem_class_ops) {
> > >               if (duplicates ||
> > > @@ -992,19 +1006,25 @@ static int check_netem_in_tree(struct Qdisc *sch, bool duplicates,
> > >       if (!qdisc_dev(root))
> > >               return 0;
> > >
> > > +     root_is_mq = root->flags & TCQ_F_MQROOT;
> > > +
> >
> > What about HTB or other inherently multi-q qdisc?
> > Using netem on HTB on some branches is common practice.
>
> Agreed - this should cover all classful qdiscs. I think the check
> comes from the earlier discussion which involves offloadable qdiscs
> that have multiple children (where HTB fits as well).
> @Victor Nogueira how about check for nested netems with duplicates?

Ok, will look at that approach for the next update

cheers,
Victor

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2025-11-26 20:31 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-25 22:46 [RFC PATCH net-next v2] net/sched: Introduce qdisc quirk_chk op Victor Nogueira
2025-11-26  3:11 ` Stephen Hemminger
2025-11-26 16:29   ` Jamal Hadi Salim
2025-11-26 20:31     ` Victor Nogueira
2025-11-26  4:02 ` Cong Wang
2025-11-26  8:16 ` [syzbot ci] " syzbot ci

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).