* [PATCH net 0/3] net_sched: sch_sfq: reject a derived limit of 1
@ 2025-03-28 20:16 Octavian Purdila
2025-03-28 20:16 ` [PATCH net 1/3] net_sched: sch_sfq: use a temporary work area for validating configuration Octavian Purdila
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Octavian Purdila @ 2025-03-28 20:16 UTC (permalink / raw)
To: jhs, xiyou.wangcong, jiri
Cc: davem, edumazet, kuba, pabeni, horms, shuah, netdev,
Octavian Purdila
Because sfq parameters can influence each other there can be
situations where although the user sets a limit of 2 it can be lowered
to 1:
$ tc qdisc add dev dummy0 handle 1: root sfq limit 2 flows 1 depth 1
$ tc qdisc show dev dummy0
qdisc sfq 1: dev dummy0 root refcnt 2 limit 1p quantum 1514b depth 1 divisor 1024
$ tc qdisc add dev dummy0 handle 1: root sfq limit 2 flows 10 depth 1 divisor 1
$ tc qdisc show dev dummy0
qdisc sfq 2: root refcnt 2 limit 1p quantum 1514b depth 1 divisor 1
As a limit of 1 is invalid, this patch series moves the limit
validation to after all configuration changes have been done. To do
so, the configuration is done in a temporary work area then applied to
the internal state.
The patch series also adds new test cases.
Octavian Purdila (3):
net_sched: sch_sfq: use a temporary work area for validating
configuration
net_sched: sch_sfq: move the limit validation
selftests/tc-testing: sfq: check that a derived limit of 1 is rejected
net/sched/sch_sfq.c | 70 ++++++++++++++-----
.../tc-testing/tc-tests/qdiscs/sfq.json | 36 ++++++++++
2 files changed, 90 insertions(+), 16 deletions(-)
--
2.49.0.472.ge94155a9ec-goog
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH net 1/3] net_sched: sch_sfq: use a temporary work area for validating configuration
2025-03-28 20:16 [PATCH net 0/3] net_sched: sch_sfq: reject a derived limit of 1 Octavian Purdila
@ 2025-03-28 20:16 ` Octavian Purdila
2025-03-30 23:49 ` Cong Wang
2025-03-28 20:16 ` [PATCH net 2/3] net_sched: sch_sfq: move the limit validation Octavian Purdila
2025-03-28 20:16 ` [PATCH net 3/3] selftests/tc-testing: sfq: check that a derived limit of 1 is rejected Octavian Purdila
2 siblings, 1 reply; 9+ messages in thread
From: Octavian Purdila @ 2025-03-28 20:16 UTC (permalink / raw)
To: jhs, xiyou.wangcong, jiri
Cc: davem, edumazet, kuba, pabeni, horms, shuah, netdev,
Octavian Purdila
Many configuration parameters have influence on others (e.g. divisor
-> flows -> limit, depth -> limit) and so it is difficult to correctly
do all of the validation before applying the configuration. And if a
validation error is detected late it is difficult to roll back a
partially applied configuration.
To avoid these issues use a temporary work area to update and validate
the configuration and only then apply the configuration to the
internal state.
Signed-off-by: Octavian Purdila <tavip@google.com>
---
net/sched/sch_sfq.c | 60 ++++++++++++++++++++++++++++++++++++---------
1 file changed, 48 insertions(+), 12 deletions(-)
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 65d5b59da583..027a3fde2139 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -631,6 +631,18 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt,
struct red_parms *p = NULL;
struct sk_buff *to_free = NULL;
struct sk_buff *tail = NULL;
+ /* work area for validating changes before committing them */
+ struct {
+ int limit;
+ unsigned int divisor;
+ unsigned int maxflows;
+ int perturb_period;
+ unsigned int quantum;
+ u8 headdrop;
+ u8 maxdepth;
+ u8 flags;
+ } tmp;
+
if (opt->nla_len < nla_attr_size(sizeof(*ctl)))
return -EINVAL;
@@ -656,36 +668,60 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt,
NL_SET_ERR_MSG_MOD(extack, "invalid limit");
return -EINVAL;
}
+
sch_tree_lock(sch);
+
+ /* copy configuration to work area */
+ tmp.limit = q->limit;
+ tmp.divisor = q->divisor;
+ tmp.headdrop = q->headdrop;
+ tmp.maxdepth = q->maxdepth;
+ tmp.maxflows = q->maxflows;
+ tmp.perturb_period = q->perturb_period;
+ tmp.quantum = q->quantum;
+ tmp.flags = q->flags;
+
+ /* update and validate configuration */
if (ctl->quantum)
- q->quantum = ctl->quantum;
- WRITE_ONCE(q->perturb_period, ctl->perturb_period * HZ);
+ tmp.quantum = ctl->quantum;
+ tmp.perturb_period = ctl->perturb_period * HZ;
if (ctl->flows)
- q->maxflows = min_t(u32, ctl->flows, SFQ_MAX_FLOWS);
+ tmp.maxflows = min_t(u32, ctl->flows, SFQ_MAX_FLOWS);
if (ctl->divisor) {
- q->divisor = ctl->divisor;
- q->maxflows = min_t(u32, q->maxflows, q->divisor);
+ tmp.divisor = ctl->divisor;
+ tmp.maxflows = min_t(u32, tmp.maxflows, tmp.divisor);
}
if (ctl_v1) {
if (ctl_v1->depth)
- q->maxdepth = min_t(u32, ctl_v1->depth, SFQ_MAX_DEPTH);
+ tmp.maxdepth = min_t(u32, ctl_v1->depth, SFQ_MAX_DEPTH);
if (p) {
- swap(q->red_parms, p);
- red_set_parms(q->red_parms,
+ red_set_parms(p,
ctl_v1->qth_min, ctl_v1->qth_max,
ctl_v1->Wlog,
ctl_v1->Plog, ctl_v1->Scell_log,
NULL,
ctl_v1->max_P);
}
- q->flags = ctl_v1->flags;
- q->headdrop = ctl_v1->headdrop;
+ tmp.flags = ctl_v1->flags;
+ tmp.headdrop = ctl_v1->headdrop;
}
if (ctl->limit) {
- q->limit = min_t(u32, ctl->limit, q->maxdepth * q->maxflows);
- q->maxflows = min_t(u32, q->maxflows, q->limit);
+ tmp.limit = min_t(u32, ctl->limit, tmp.maxdepth * tmp.maxflows);
+ tmp.maxflows = min_t(u32, tmp.maxflows, tmp.limit);
}
+ /* commit configuration, no return from this point further */
+ q->limit = tmp.limit;
+ q->divisor = tmp.divisor;
+ q->headdrop = tmp.headdrop;
+ q->maxdepth = tmp.maxdepth;
+ q->maxflows = tmp.maxflows;
+ WRITE_ONCE(q->perturb_period, tmp.perturb_period);
+ q->quantum = tmp.quantum;
+ q->flags = tmp.flags;
+ if (p)
+ swap(q->red_parms, p);
+
qlen = sch->q.qlen;
while (sch->q.qlen > q->limit) {
dropped += sfq_drop(sch, &to_free);
--
2.49.0.472.ge94155a9ec-goog
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net 2/3] net_sched: sch_sfq: move the limit validation
2025-03-28 20:16 [PATCH net 0/3] net_sched: sch_sfq: reject a derived limit of 1 Octavian Purdila
2025-03-28 20:16 ` [PATCH net 1/3] net_sched: sch_sfq: use a temporary work area for validating configuration Octavian Purdila
@ 2025-03-28 20:16 ` Octavian Purdila
2025-03-28 20:16 ` [PATCH net 3/3] selftests/tc-testing: sfq: check that a derived limit of 1 is rejected Octavian Purdila
2 siblings, 0 replies; 9+ messages in thread
From: Octavian Purdila @ 2025-03-28 20:16 UTC (permalink / raw)
To: jhs, xiyou.wangcong, jiri
Cc: davem, edumazet, kuba, pabeni, horms, shuah, netdev,
Octavian Purdila, syzbot
It is not sufficient to directly validate the limit on the data that
the user passes as it can be updated based on how the other parameters
are changed.
Move the check at the end of the configuration update process to also
catch scenarios where the limit is indirectly updated, for example
with the following configurations:
tc qdisc add dev dummy0 handle 1: root sfq limit 2 flows 1 depth 1
tc qdisc add dev dummy0 handle 1: root sfq limit 2 flows 1 divisor 1
This fixes the following syzkaller reported crash:
------------[ cut here ]------------
UBSAN: array-index-out-of-bounds in net/sched/sch_sfq.c:203:6
index 65535 is out of range for type 'struct sfq_head[128]'
CPU: 1 UID: 0 PID: 3037 Comm: syz.2.16 Not tainted 6.14.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x201/0x300 lib/dump_stack.c:120
ubsan_epilogue lib/ubsan.c:231 [inline]
__ubsan_handle_out_of_bounds+0xf5/0x120 lib/ubsan.c:429
sfq_link net/sched/sch_sfq.c:203 [inline]
sfq_dec+0x53c/0x610 net/sched/sch_sfq.c:231
sfq_dequeue+0x34e/0x8c0 net/sched/sch_sfq.c:493
sfq_reset+0x17/0x60 net/sched/sch_sfq.c:518
qdisc_reset+0x12e/0x600 net/sched/sch_generic.c:1035
tbf_reset+0x41/0x110 net/sched/sch_tbf.c:339
qdisc_reset+0x12e/0x600 net/sched/sch_generic.c:1035
dev_reset_queue+0x100/0x1b0 net/sched/sch_generic.c:1311
netdev_for_each_tx_queue include/linux/netdevice.h:2590 [inline]
dev_deactivate_many+0x7e5/0xe70 net/sched/sch_generic.c:1375
Reported-by: syzbot <syzkaller@googlegroups.com>
Fixes: 10685681bafc ("net_sched: sch_sfq: don't allow 1 packet limit")
Signed-off-by: Octavian Purdila <tavip@google.com>
---
net/sched/sch_sfq.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 027a3fde2139..a9f20cc98a1a 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -664,10 +664,6 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt,
if (!p)
return -ENOMEM;
}
- if (ctl->limit == 1) {
- NL_SET_ERR_MSG_MOD(extack, "invalid limit");
- return -EINVAL;
- }
sch_tree_lock(sch);
@@ -709,6 +705,12 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt,
tmp.limit = min_t(u32, ctl->limit, tmp.maxdepth * tmp.maxflows);
tmp.maxflows = min_t(u32, tmp.maxflows, tmp.limit);
}
+ if (tmp.limit == 1) {
+ sch_tree_unlock(sch);
+ kfree(p);
+ NL_SET_ERR_MSG_MOD(extack, "invalid limit");
+ return -EINVAL;
+ }
/* commit configuration, no return from this point further */
q->limit = tmp.limit;
--
2.49.0.472.ge94155a9ec-goog
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH net 3/3] selftests/tc-testing: sfq: check that a derived limit of 1 is rejected
2025-03-28 20:16 [PATCH net 0/3] net_sched: sch_sfq: reject a derived limit of 1 Octavian Purdila
2025-03-28 20:16 ` [PATCH net 1/3] net_sched: sch_sfq: use a temporary work area for validating configuration Octavian Purdila
2025-03-28 20:16 ` [PATCH net 2/3] net_sched: sch_sfq: move the limit validation Octavian Purdila
@ 2025-03-28 20:16 ` Octavian Purdila
2 siblings, 0 replies; 9+ messages in thread
From: Octavian Purdila @ 2025-03-28 20:16 UTC (permalink / raw)
To: jhs, xiyou.wangcong, jiri
Cc: davem, edumazet, kuba, pabeni, horms, shuah, netdev,
Octavian Purdila
Because the limit is updated indirectly when other parameters are
updated, there are cases where even though the user requests a limit
of 2 it can actually be set to 1.
Add the following test cases to check that the kernel rejects them:
- limit 2 depth 1 flows 1
- limit 2 depth 1 divisor 1
Signed-off-by: Octavian Purdila <tavip@google.com>
---
.../tc-testing/tc-tests/qdiscs/sfq.json | 36 +++++++++++++++++++
1 file changed, 36 insertions(+)
diff --git a/tools/testing/selftests/tc-testing/tc-tests/qdiscs/sfq.json b/tools/testing/selftests/tc-testing/tc-tests/qdiscs/sfq.json
index 50e8d72781cb..28c6ce6da7db 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/qdiscs/sfq.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/qdiscs/sfq.json
@@ -228,5 +228,41 @@
"matchCount": "0",
"teardown": [
]
+ },
+ {
+ "id": "7f8f",
+ "name": "Check that a derived limit of 1 is rejected (limit 2 depth 1 flows 1)",
+ "category": [
+ "qdisc",
+ "sfq"
+ ],
+ "plugins": {
+ "requires": "nsPlugin"
+ },
+ "setup": [],
+ "cmdUnderTest": "$TC qdisc add dev $DUMMY handle 1: root sfq limit 2 depth 1 flows 1",
+ "expExitCode": "2",
+ "verifyCmd": "$TC qdisc show dev $DUMMY",
+ "matchPattern": "sfq",
+ "matchCount": "0",
+ "teardown": []
+ },
+ {
+ "id": "5168",
+ "name": "Check that a derived limit of 1 is rejected (limit 2 depth 1 divisor 1)",
+ "category": [
+ "qdisc",
+ "sfq"
+ ],
+ "plugins": {
+ "requires": "nsPlugin"
+ },
+ "setup": [],
+ "cmdUnderTest": "$TC qdisc add dev $DUMMY handle 1: root sfq limit 2 depth 1 divisor 1",
+ "expExitCode": "2",
+ "verifyCmd": "$TC qdisc show dev $DUMMY",
+ "matchPattern": "sfq",
+ "matchCount": "0",
+ "teardown": []
}
]
--
2.49.0.472.ge94155a9ec-goog
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH net 1/3] net_sched: sch_sfq: use a temporary work area for validating configuration
2025-03-28 20:16 ` [PATCH net 1/3] net_sched: sch_sfq: use a temporary work area for validating configuration Octavian Purdila
@ 2025-03-30 23:49 ` Cong Wang
2025-04-01 9:26 ` Paolo Abeni
0 siblings, 1 reply; 9+ messages in thread
From: Cong Wang @ 2025-03-30 23:49 UTC (permalink / raw)
To: Octavian Purdila
Cc: jhs, jiri, davem, edumazet, kuba, pabeni, horms, shuah, netdev
On Fri, Mar 28, 2025 at 01:16:32PM -0700, Octavian Purdila wrote:
> Many configuration parameters have influence on others (e.g. divisor
> -> flows -> limit, depth -> limit) and so it is difficult to correctly
> do all of the validation before applying the configuration. And if a
> validation error is detected late it is difficult to roll back a
> partially applied configuration.
>
> To avoid these issues use a temporary work area to update and validate
> the configuration and only then apply the configuration to the
> internal state.
>
> Signed-off-by: Octavian Purdila <tavip@google.com>
> ---
> net/sched/sch_sfq.c | 60 ++++++++++++++++++++++++++++++++++++---------
> 1 file changed, 48 insertions(+), 12 deletions(-)
>
> diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
> index 65d5b59da583..027a3fde2139 100644
> --- a/net/sched/sch_sfq.c
> +++ b/net/sched/sch_sfq.c
> @@ -631,6 +631,18 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt,
> struct red_parms *p = NULL;
> struct sk_buff *to_free = NULL;
> struct sk_buff *tail = NULL;
> + /* work area for validating changes before committing them */
> + struct {
> + int limit;
> + unsigned int divisor;
> + unsigned int maxflows;
> + int perturb_period;
> + unsigned int quantum;
> + u8 headdrop;
> + u8 maxdepth;
> + u8 flags;
> + } tmp;
Thanks for your patch. It reminds me again about the lacking of complete
RCU support in TC. ;-)
Instead of using a temporary struct, how about introducing a new one
called struct sfq_sched_opt and putting it inside struct sfq_sched_data?
It looks more elegant to me.
Regards,
Cong
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net 1/3] net_sched: sch_sfq: use a temporary work area for validating configuration
2025-03-30 23:49 ` Cong Wang
@ 2025-04-01 9:26 ` Paolo Abeni
2025-04-01 10:47 ` Eric Dumazet
0 siblings, 1 reply; 9+ messages in thread
From: Paolo Abeni @ 2025-04-01 9:26 UTC (permalink / raw)
To: Cong Wang, Octavian Purdila
Cc: jhs, jiri, davem, edumazet, kuba, horms, shuah, netdev
On 3/31/25 1:49 AM, Cong Wang wrote:
> On Fri, Mar 28, 2025 at 01:16:32PM -0700, Octavian Purdila wrote:
>> diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
>> index 65d5b59da583..027a3fde2139 100644
>> --- a/net/sched/sch_sfq.c
>> +++ b/net/sched/sch_sfq.c
>> @@ -631,6 +631,18 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt,
>> struct red_parms *p = NULL;
>> struct sk_buff *to_free = NULL;
>> struct sk_buff *tail = NULL;
>> + /* work area for validating changes before committing them */
>> + struct {
>> + int limit;
>> + unsigned int divisor;
>> + unsigned int maxflows;
>> + int perturb_period;
>> + unsigned int quantum;
>> + u8 headdrop;
>> + u8 maxdepth;
>> + u8 flags;
>> + } tmp;
>
> Thanks for your patch. It reminds me again about the lacking of complete
> RCU support in TC. ;-)
>
> Instead of using a temporary struct, how about introducing a new one
> called struct sfq_sched_opt and putting it inside struct sfq_sched_data?
> It looks more elegant to me.
I agree with that. It should also make the code more compact. @Octavian,
please update the patch as per Cong's suggestion.
Thanks,
Paolo
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net 1/3] net_sched: sch_sfq: use a temporary work area for validating configuration
2025-04-01 9:26 ` Paolo Abeni
@ 2025-04-01 10:47 ` Eric Dumazet
2025-04-01 11:13 ` Paolo Abeni
0 siblings, 1 reply; 9+ messages in thread
From: Eric Dumazet @ 2025-04-01 10:47 UTC (permalink / raw)
To: Paolo Abeni
Cc: Cong Wang, Octavian Purdila, jhs, jiri, davem, kuba, horms, shuah,
netdev
On Tue, Apr 1, 2025 at 11:27 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 3/31/25 1:49 AM, Cong Wang wrote:
> > On Fri, Mar 28, 2025 at 01:16:32PM -0700, Octavian Purdila wrote:
> >> diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
> >> index 65d5b59da583..027a3fde2139 100644
> >> --- a/net/sched/sch_sfq.c
> >> +++ b/net/sched/sch_sfq.c
> >> @@ -631,6 +631,18 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt,
> >> struct red_parms *p = NULL;
> >> struct sk_buff *to_free = NULL;
> >> struct sk_buff *tail = NULL;
> >> + /* work area for validating changes before committing them */
> >> + struct {
> >> + int limit;
> >> + unsigned int divisor;
> >> + unsigned int maxflows;
> >> + int perturb_period;
> >> + unsigned int quantum;
> >> + u8 headdrop;
> >> + u8 maxdepth;
> >> + u8 flags;
> >> + } tmp;
> >
> > Thanks for your patch. It reminds me again about the lacking of complete
> > RCU support in TC. ;-)
> >
> > Instead of using a temporary struct, how about introducing a new one
> > called struct sfq_sched_opt and putting it inside struct sfq_sched_data?
> > It looks more elegant to me.
>
> I agree with that. It should also make the code more compact. @Octavian,
> please update the patch as per Cong's suggestion.
The concern with this approach was data locality.
I had in my TODO list a patch to remove (accumulated over time) holes
and put together hot fields.
Something like :
struct sfq_sched_data {
int limit; /* 0 0x4 */
unsigned int divisor; /* 0x4 0x4 */
u8 headdrop; /* 0x8 0x1 */
u8 maxdepth; /* 0x9 0x1 */
u8 cur_depth; /* 0xa 0x1 */
u8 flags; /* 0xb 0x1 */
unsigned int quantum; /* 0xc 0x4 */
siphash_key_t perturbation; /* 0x10 0x10 */
struct tcf_proto * filter_list; /* 0x20 0x8 */
struct tcf_block * block; /* 0x28 0x8 */
sfq_index * ht; /* 0x30 0x8 */
struct sfq_slot * slots; /* 0x38 0x8 */
/* --- cacheline 1 boundary (64 bytes) --- */
struct red_parms * red_parms; /* 0x40 0x8 */
struct tc_sfqred_stats stats; /* 0x48 0x18 */
struct sfq_slot * tail; /* 0x60 0x8 */
struct sfq_head dep[128]; /* 0x68 0x200 */
/* --- cacheline 9 boundary (576 bytes) was 40 bytes ago --- */
unsigned int maxflows; /* 0x268 0x4 */
int perturb_period; /* 0x26c 0x4 */
struct timer_list perturb_timer; /* 0x270 0x28 */
/* XXX last struct has 4 bytes of padding */
/* --- cacheline 10 boundary (640 bytes) was 24 bytes ago --- */
struct Qdisc * sch; /* 0x298 0x8 */
/* size: 672, cachelines: 11, members: 20 */
/* paddings: 1, sum paddings: 4 */
/* last cacheline: 32 bytes */
};
With this patch :
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 65d5b59da583..f8fec2bc0d25 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -110,10 +110,11 @@ struct sfq_sched_data {
unsigned int divisor; /* number of slots in hash table */
u8 headdrop;
u8 maxdepth; /* limit of packets per flow */
-
- siphash_key_t perturbation;
u8 cur_depth; /* depth of longest slot */
u8 flags;
+ unsigned int quantum; /* Allotment per round: MUST
BE >= MTU */
+
+ siphash_key_t perturbation;
struct tcf_proto __rcu *filter_list;
struct tcf_block *block;
sfq_index *ht; /* Hash table ('divisor' slots) */
@@ -132,7 +133,6 @@ struct sfq_sched_data {
unsigned int maxflows; /* number of flows in flows array */
int perturb_period;
- unsigned int quantum; /* Allotment per round: MUST
BE >= MTU */
struct timer_list perturb_timer;
struct Qdisc *sch;
};
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH net 1/3] net_sched: sch_sfq: use a temporary work area for validating configuration
2025-04-01 10:47 ` Eric Dumazet
@ 2025-04-01 11:13 ` Paolo Abeni
2025-04-01 16:36 ` Octavian Purdila
0 siblings, 1 reply; 9+ messages in thread
From: Paolo Abeni @ 2025-04-01 11:13 UTC (permalink / raw)
To: Eric Dumazet
Cc: Cong Wang, Octavian Purdila, jhs, jiri, davem, kuba, horms, shuah,
netdev
On 4/1/25 12:47 PM, Eric Dumazet wrote:
> On Tue, Apr 1, 2025 at 11:27 AM Paolo Abeni <pabeni@redhat.com> wrote:
>> On 3/31/25 1:49 AM, Cong Wang wrote:
>>> On Fri, Mar 28, 2025 at 01:16:32PM -0700, Octavian Purdila wrote:
>>>> diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
>>>> index 65d5b59da583..027a3fde2139 100644
>>>> --- a/net/sched/sch_sfq.c
>>>> +++ b/net/sched/sch_sfq.c
>>>> @@ -631,6 +631,18 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt,
>>>> struct red_parms *p = NULL;
>>>> struct sk_buff *to_free = NULL;
>>>> struct sk_buff *tail = NULL;
>>>> + /* work area for validating changes before committing them */
>>>> + struct {
>>>> + int limit;
>>>> + unsigned int divisor;
>>>> + unsigned int maxflows;
>>>> + int perturb_period;
>>>> + unsigned int quantum;
>>>> + u8 headdrop;
>>>> + u8 maxdepth;
>>>> + u8 flags;
>>>> + } tmp;
>>>
>>> Thanks for your patch. It reminds me again about the lacking of complete
>>> RCU support in TC. ;-)
>>>
>>> Instead of using a temporary struct, how about introducing a new one
>>> called struct sfq_sched_opt and putting it inside struct sfq_sched_data?
>>> It looks more elegant to me.
>>
>> I agree with that. It should also make the code more compact. @Octavian,
>> please update the patch as per Cong's suggestion.
>
> The concern with this approach was data locality.
I did not consider that aspect.
How about not using the struct at all, then?
int cur_limit;
// ...
u8 cur_flags;
the 'tmp' struct is IMHO not so nice.
> I had in my TODO list a patch to remove (accumulated over time) holes
> and put together hot fields.
>
> Something like :
>
> struct sfq_sched_data {
> int limit; /* 0 0x4 */
> unsigned int divisor; /* 0x4 0x4 */
> u8 headdrop; /* 0x8 0x1 */
> u8 maxdepth; /* 0x9 0x1 */
> u8 cur_depth; /* 0xa 0x1 */
> u8 flags; /* 0xb 0x1 */
> unsigned int quantum; /* 0xc 0x4 */
> siphash_key_t perturbation; /* 0x10 0x10 */
> struct tcf_proto * filter_list; /* 0x20 0x8 */
> struct tcf_block * block; /* 0x28 0x8 */
> sfq_index * ht; /* 0x30 0x8 */
> struct sfq_slot * slots; /* 0x38 0x8 */
> /* --- cacheline 1 boundary (64 bytes) --- */
> struct red_parms * red_parms; /* 0x40 0x8 */
> struct tc_sfqred_stats stats; /* 0x48 0x18 */
> struct sfq_slot * tail; /* 0x60 0x8 */
> struct sfq_head dep[128]; /* 0x68 0x200 */
> /* --- cacheline 9 boundary (576 bytes) was 40 bytes ago --- */
> unsigned int maxflows; /* 0x268 0x4 */
> int perturb_period; /* 0x26c 0x4 */
> struct timer_list perturb_timer; /* 0x270 0x28 */
>
> /* XXX last struct has 4 bytes of padding */
>
> /* --- cacheline 10 boundary (640 bytes) was 24 bytes ago --- */
> struct Qdisc * sch; /* 0x298 0x8 */
>
> /* size: 672, cachelines: 11, members: 20 */
> /* paddings: 1, sum paddings: 4 */
> /* last cacheline: 32 bytes */
> };
>
>
> With this patch :
>
> diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
> index 65d5b59da583..f8fec2bc0d25 100644
> --- a/net/sched/sch_sfq.c
> +++ b/net/sched/sch_sfq.c
> @@ -110,10 +110,11 @@ struct sfq_sched_data {
> unsigned int divisor; /* number of slots in hash table */
> u8 headdrop;
> u8 maxdepth; /* limit of packets per flow */
> -
> - siphash_key_t perturbation;
> u8 cur_depth; /* depth of longest slot */
> u8 flags;
> + unsigned int quantum; /* Allotment per round: MUST
> BE >= MTU */
> +
> + siphash_key_t perturbation;
> struct tcf_proto __rcu *filter_list;
> struct tcf_block *block;
> sfq_index *ht; /* Hash table ('divisor' slots) */
> @@ -132,7 +133,6 @@ struct sfq_sched_data {
>
> unsigned int maxflows; /* number of flows in flows array */
> int perturb_period;
Would it make any sense to additionally move 'maxflows' and
'perturb_period' at the top, just after 'perturbation'?
Thanks,
Paolo
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH net 1/3] net_sched: sch_sfq: use a temporary work area for validating configuration
2025-04-01 11:13 ` Paolo Abeni
@ 2025-04-01 16:36 ` Octavian Purdila
0 siblings, 0 replies; 9+ messages in thread
From: Octavian Purdila @ 2025-04-01 16:36 UTC (permalink / raw)
To: Paolo Abeni
Cc: Eric Dumazet, Cong Wang, jhs, jiri, davem, kuba, horms, shuah,
netdev
On Tue, Apr 1, 2025 at 4:13 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 4/1/25 12:47 PM, Eric Dumazet wrote:
> > On Tue, Apr 1, 2025 at 11:27 AM Paolo Abeni <pabeni@redhat.com> wrote:
> >> On 3/31/25 1:49 AM, Cong Wang wrote:
> >>> On Fri, Mar 28, 2025 at 01:16:32PM -0700, Octavian Purdila wrote:
> >>>> diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
> >>>> index 65d5b59da583..027a3fde2139 100644
> >>>> --- a/net/sched/sch_sfq.c
> >>>> +++ b/net/sched/sch_sfq.c
> >>>> @@ -631,6 +631,18 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt,
> >>>> struct red_parms *p = NULL;
> >>>> struct sk_buff *to_free = NULL;
> >>>> struct sk_buff *tail = NULL;
> >>>> + /* work area for validating changes before committing them */
> >>>> + struct {
> >>>> + int limit;
> >>>> + unsigned int divisor;
> >>>> + unsigned int maxflows;
> >>>> + int perturb_period;
> >>>> + unsigned int quantum;
> >>>> + u8 headdrop;
> >>>> + u8 maxdepth;
> >>>> + u8 flags;
> >>>> + } tmp;
> >>>
> >>> Thanks for your patch. It reminds me again about the lacking of complete
> >>> RCU support in TC. ;-)
> >>>
> >>> Instead of using a temporary struct, how about introducing a new one
> >>> called struct sfq_sched_opt and putting it inside struct sfq_sched_data?
> >>> It looks more elegant to me.
> >>
> >> I agree with that. It should also make the code more compact. @Octavian,
> >> please update the patch as per Cong's suggestion.
> >
> > The concern with this approach was data locality.
>
> I did not consider that aspect.
>
> How about not using the struct at all, then?
>
> int cur_limit;
> // ...
> u8 cur_flags;
>
> the 'tmp' struct is IMHO not so nice.
>
Thanks Paolo and Cong for the review.
I'll drop the structure, my initial thoughts were that by grouping
them into a structure it would make it easier to understand the
purpose of those variables, especially since we have a lot of local
variables now.
> > I had in my TODO list a patch to remove (accumulated over time) holes
> > and put together hot fields.
> >
> > Something like :
> >
> > struct sfq_sched_data {
> > int limit; /* 0 0x4 */
> > unsigned int divisor; /* 0x4 0x4 */
> > u8 headdrop; /* 0x8 0x1 */
> > u8 maxdepth; /* 0x9 0x1 */
> > u8 cur_depth; /* 0xa 0x1 */
> > u8 flags; /* 0xb 0x1 */
> > unsigned int quantum; /* 0xc 0x4 */
> > siphash_key_t perturbation; /* 0x10 0x10 */
> > struct tcf_proto * filter_list; /* 0x20 0x8 */
> > struct tcf_block * block; /* 0x28 0x8 */
> > sfq_index * ht; /* 0x30 0x8 */
> > struct sfq_slot * slots; /* 0x38 0x8 */
> > /* --- cacheline 1 boundary (64 bytes) --- */
> > struct red_parms * red_parms; /* 0x40 0x8 */
> > struct tc_sfqred_stats stats; /* 0x48 0x18 */
> > struct sfq_slot * tail; /* 0x60 0x8 */
> > struct sfq_head dep[128]; /* 0x68 0x200 */
> > /* --- cacheline 9 boundary (576 bytes) was 40 bytes ago --- */
> > unsigned int maxflows; /* 0x268 0x4 */
> > int perturb_period; /* 0x26c 0x4 */
> > struct timer_list perturb_timer; /* 0x270 0x28 */
> >
> > /* XXX last struct has 4 bytes of padding */
> >
> > /* --- cacheline 10 boundary (640 bytes) was 24 bytes ago --- */
> > struct Qdisc * sch; /* 0x298 0x8 */
> >
> > /* size: 672, cachelines: 11, members: 20 */
> > /* paddings: 1, sum paddings: 4 */
> > /* last cacheline: 32 bytes */
> > };
> >
> >
> > With this patch :
> >
> > diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
> > index 65d5b59da583..f8fec2bc0d25 100644
> > --- a/net/sched/sch_sfq.c
> > +++ b/net/sched/sch_sfq.c
> > @@ -110,10 +110,11 @@ struct sfq_sched_data {
> > unsigned int divisor; /* number of slots in hash table */
> > u8 headdrop;
> > u8 maxdepth; /* limit of packets per flow */
> > -
> > - siphash_key_t perturbation;
> > u8 cur_depth; /* depth of longest slot */
> > u8 flags;
> > + unsigned int quantum; /* Allotment per round: MUST
> > BE >= MTU */
> > +
> > + siphash_key_t perturbation;
> > struct tcf_proto __rcu *filter_list;
> > struct tcf_block *block;
> > sfq_index *ht; /* Hash table ('divisor' slots) */
> > @@ -132,7 +133,6 @@ struct sfq_sched_data {
> >
> > unsigned int maxflows; /* number of flows in flows array */
> > int perturb_period;
>
> Would it make any sense to additionally move 'maxflows' and
> 'perturb_period' at the top, just after 'perturbation'?
>
> Thanks,
>
> Paolo
>
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2025-04-01 16:36 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-28 20:16 [PATCH net 0/3] net_sched: sch_sfq: reject a derived limit of 1 Octavian Purdila
2025-03-28 20:16 ` [PATCH net 1/3] net_sched: sch_sfq: use a temporary work area for validating configuration Octavian Purdila
2025-03-30 23:49 ` Cong Wang
2025-04-01 9:26 ` Paolo Abeni
2025-04-01 10:47 ` Eric Dumazet
2025-04-01 11:13 ` Paolo Abeni
2025-04-01 16:36 ` Octavian Purdila
2025-03-28 20:16 ` [PATCH net 2/3] net_sched: sch_sfq: move the limit validation Octavian Purdila
2025-03-28 20:16 ` [PATCH net 3/3] selftests/tc-testing: sfq: check that a derived limit of 1 is rejected Octavian Purdila
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).