public inbox for netdev@vger.kernel.org
From: Cong Wang <xiyou.wangcong@gmail.com>
To: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: netdev@vger.kernel.org, will@willsroot.io,
	stephen@networkplumber.org,
	Savino Dicanosa <savy@syst3mfailure.io>
Subject: Re: [Patch net 1/2] netem: Fix skb duplication logic to prevent infinite loops
Date: Mon, 7 Jul 2025 12:40:40 -0700
Message-ID: <aGwiuDju8TNvRdGe@pop-os.localdomain>
In-Reply-To: <CAM0EoM=99ufQSzbYZU=wz8fbYOQ2v+cMa7BX1EM6OHk+dBrE0Q@mail.gmail.com>

On Sat, Jul 05, 2025 at 09:52:05AM -0400, Jamal Hadi Salim wrote:
> On Fri, Jul 4, 2025 at 8:48 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> >
> > On Wed, Jul 02, 2025 at 11:04:22AM -0400, Jamal Hadi Salim wrote:
> > > On Wed, Jul 2, 2025 at 10:12 AM Jamal Hadi Salim <jhs@mojatatu.com> wrote:
> > > >
> > > > On Tue, Jul 1, 2025 at 9:57 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> > > > >
> > > > > On Tue, Jul 01, 2025 at 04:13:05PM -0700, Cong Wang wrote:
> > > > > > diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
> > > > > > index fdd79d3ccd8c..33de9c3e4d1b 100644
> > > > > > --- a/net/sched/sch_netem.c
> > > > > > +++ b/net/sched/sch_netem.c
> > > > > > @@ -460,7 +460,8 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch,
> > > > > >       skb->prev = NULL;
> > > > > >
> > > > > >       /* Random duplication */
> > > > > > -     if (q->duplicate && q->duplicate >= get_crandom(&q->dup_cor, &q->prng))
> > > > > > +     if (tc_skb_cb(skb)->duplicate &&
> > > > >
> > > > > > Oops, this clearly should be !duplicate... It was lost during my
> > > > > > stupid copy-n-paste... Sorry for this mistake.
> > > > >
> > > >
> > > > I understood you earlier, Cong. My view still stands:
> > > > You are adding logic to a common data structure for a use case that
> >
> > You are exaggerating this. I only added 1 bit to the core data structure,
> > the code logic remains in the netem, so it is contained within netem.
> 
> Try it out ;->
> Here's an even simpler setup:
> 
> sudo tc qdisc add dev lo root handle 1: prio bands 3 priomap 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> sudo tc filter add dev lo parent 1:0 protocol ip bpf obj netem_bug_test.o sec classifier/pass classid 1:1
> sudo tc qdisc add dev lo parent 1:1 handle 10: netem limit 4 duplicate 100%
> then:
> ping -c 1 127.0.0.1

Of course (I replaced your ebpf filter with matchall):

[root@localhost ~]# cat netem_from_jamal.sh
tc qdisc add dev lo root handle 1: prio bands 3 priomap 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# tc filter add dev lo parent 1:0 protocol ip bpf obj netem_bug_test.o sec classifier/pass classid 1:1
tc filter add dev lo parent 1:0 protocol ip matchall classid 1:1
tc qdisc add dev lo parent 1:1 handle 10: netem limit 4 duplicate 100%

[root@localhost ~]# bash -x netem_from_jamal.sh
+ tc qdisc add dev lo root handle 1: prio bands 3 priomap 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
+ tc filter add dev lo parent 1:0 protocol ip matchall classid 1:1
+ tc qdisc add dev lo parent 1:1 handle 10: netem limit 4 duplicate 100%
[root@localhost ~]# ping -c 1 127.0.0.1
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=3.84 ms

--- 127.0.0.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 3.836/3.836/3.836/0.000 ms

There is clearly no soft lockup. Hence the original issue has been successfully fixed.
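
To make the reasoning concrete, here is a rough userspace toy model (not the kernel code, and not the patch itself) of why a one-bit "already duplicated" mark on the packet bounds the re-enqueue loop. Everything named toy_* is made up for illustration. It assumes every stage duplicates at 100%, that a clone always re-enters at the root, and it ignores the existing per-qdisc q->duplicate save/restore, so treat it as a sketch of the idea only:

```c
#include <stdbool.h>

/* Safety cap so the unguarded case terminates in this toy model. */
#define TOY_MAX_ENQUEUES 1000

/* Hypothetical stand-in for an skb carrying the one-bit "duplicate" mark. */
struct toy_pkt {
    bool duplicate;
};

/* Hypothetical chain of duplicating stages, entered at stage 0 ("root"). */
struct toy_tree {
    int stages;     /* number of chained stages, each duplicating 100% */
    bool use_flag;  /* true: skip duplication for already-marked packets */
    int enqueues;   /* total number of stage visits so far */
};

static void toy_enqueue(struct toy_tree *t, struct toy_pkt pkt, int stage)
{
    if (stage >= t->stages)
        return; /* the packet has left the tree */
    if (t->enqueues >= TOY_MAX_ENQUEUES)
        return; /* cap reached: this is the runaway case */
    t->enqueues++;

    /* Duplicate unless the guard is on and the packet is already a clone. */
    if (!t->use_flag || !pkt.duplicate) {
        struct toy_pkt clone = pkt;

        clone.duplicate = true;   /* mark the clone */
        toy_enqueue(t, clone, 0); /* the clone re-enters at the root */
    }

    /* The original continues to the next stage. */
    toy_enqueue(t, pkt, stage + 1);
}

/* Push one unmarked packet through the tree and count stage visits. */
int toy_run(int stages, bool use_flag)
{
    struct toy_tree t = { .stages = stages, .use_flag = use_flag, .enqueues = 0 };
    struct toy_pkt pkt = { .duplicate = false };

    toy_enqueue(&t, pkt, 0);
    return t.enqueues;
}
```

In this model, toy_run(2, true) settles after 6 stage visits, whereas toy_run(2, false) only stops when it hits TOY_MAX_ENQUEUES, which is the toy analogue of the loop.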

> 
> Note: there are other issues as well but i thought citing the ebpf one
> was sufficient to get the point across.

Please kindly define "issues" here. My definition of "issue" in this
context is the soft lockup reported by William. As I already
explained, I have _no_ intention of solving any issue other than the one
reported by William, simply because the others can probably be deferred to
-net-next.

> 
> >
> > > > really makes no sense. The ROI is not good.
> >
> > Speaking of ROI, I think you need to look at the patch stats:
> >
> > William/Your patch:
> >  1 file changed, 40 insertions(+)
> >
> > My patch:
> >  2 files changed, 4 insertions(+), 4 deletions(-)
> >
> 
> ROI is not just about LOC. The consequences of a patch are also part
> of that formula. And let's not forget the time spent so far debating
> instead of plugging the hole.

LOC matters a lot for code review and maintenance.

> 
> >
> > > > BTW: I am almost certain you will hit other issues when this goes out
> > > > or when you actually start to test and then you will have to fix more
> > > > spots.
> > > >
> > > Here's an example that breaks it:
> > >
> > > sudo tc qdisc add dev lo root handle 1: prio bands 3 priomap 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> > > sudo tc filter add dev lo parent 1:0 protocol ip bpf obj netem_bug_test.o sec classifier/pass classid 1:1
> > > sudo tc qdisc add dev lo parent 1:1 handle 10: netem limit 4 duplicate 100%
> > > sudo tc qdisc add dev lo parent 10: handle 30: netem gap 1 limit 4 duplicate 100% delay 1us reorder 100%
> > >
> > > And the ping 127.0.0.1 -c 1
> > > I had to fix your patch for correctness (attached)
> > >
> > >
> > > the ebpf prog is trivial - make it just return the classid or even zero.
> >
> > Interesting, are you sure this works before my patch?
> >
> > I don't intend to change any logic except closing the infinite loop. IOW,
> > if it didn't work before, I don't expect to make it work with this patch,
> > this patch merely fixes the infinite loop, which is sufficient as a bug fix.
> > Otherwise it would become a feature improvement. (Don't get me wrong, I
> > think this feature should be improved rather than simply forbidden, it just
> > belongs to a different patch.)
> 
> A quick solution is what William had. I asked him to use ext_cb not
> because i think it is a better solution but just so we can move
> forward.

I already posted a patch, instead of just arguing. Now you are arguing
about the patch I posted...

Thanks.
