All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	"David S. Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>, KP Singh <kpsingh@kernel.org>,
	Shuah Khan <shuah@kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH bpf-next v6 3/3] selftests/bpf: Add selftest for XDP_REDIRECT in bpf_prog_run()
Date: Fri, 07 Jan 2022 17:05:58 +0100	[thread overview]
Message-ID: <877dbba5uh.fsf@toke.dk> (raw)
In-Reply-To: <87a6g7a6e3.fsf@toke.dk>

Toke Høiland-Jørgensen <toke@redhat.com> writes:

> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>
>> On Thu, Jan 6, 2022 at 11:59 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>> +
>>> +#define NUM_PKTS 10
>>
>> I'm afraid this needs more work.
>> Just bumping above to 1M I got:
>> [  254.165911] ================================
>> [  254.166387] WARNING: inconsistent lock state
>> [  254.166882] 5.16.0-rc7-02011-g64923127f1b3 #3784 Tainted: G           O
>> [  254.167659] --------------------------------
>> [  254.168140] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
>> [  254.168793] swapper/7/0 [HC0[0]:SC1[5]:HE1:SE0] takes:
>> [  254.169373] ffff888113d24220 (&r->producer_lock){+.?.}-{2:2}, at:
>> veth_xmit+0x361/0x830
>> [  254.170317] {SOFTIRQ-ON-W} state was registered at:
>> [  254.170921]   lock_acquire+0x18a/0x450
>> [  254.171371]   _raw_spin_lock+0x2f/0x40
>> [  254.171815]   veth_xdp_xmit+0x1d7/0x8c0
>> [  254.172241]   veth_ndo_xdp_xmit+0x1d/0x50
>> [  254.172689]   bq_xmit_all+0x562/0xc30
>> [  254.173159]   __dev_flush+0xb1/0x220
>> [  254.173586]   xdp_do_flush+0xa/0x20
>> [  254.173983]   xdp_test_run_batch.constprop.25+0x90c/0xf00
>> [  254.174564]   bpf_test_run_xdp_live+0x369/0x480
>> [  254.175038]   bpf_prog_test_run_xdp+0x63f/0xe50
>> [  254.175512]   __sys_bpf+0x688/0x4410
>> [  254.175923]   __x64_sys_bpf+0x75/0xb0
>> [  254.176327]   do_syscall_64+0x34/0x80
>> [  254.176733]   entry_SYSCALL_64_after_hwframe+0x44/0xae
>> [  254.177297] irq event stamp: 130862
>> [  254.177681] hardirqs last  enabled at (130862):
>> [<ffffffff812d0812>] call_rcu+0x2a2/0x640
>> [  254.178561] hardirqs last disabled at (130861):
>> [<ffffffff812d08bd>] call_rcu+0x34d/0x640
>> [  254.179404] softirqs last  enabled at (130814):
>> [<ffffffff83c00534>] __do_softirq+0x534/0x835
>> [  254.180332] softirqs last disabled at (130839):
>> [<ffffffff811389f7>] irq_exit_rcu+0xe7/0x120
>> [  254.181255]
>> [  254.181255] other info that might help us debug this:
>> [  254.181969]  Possible unsafe locking scenario:
>> [  254.183172]   lock(&r->producer_lock);
>> [  254.183590]   <Interrupt>
>> [  254.183893]     lock(&r->producer_lock);
>> [  254.184321]
>> [  254.184321]  *** DEADLOCK ***
>> [  254.184321]
>> [  254.185047] 5 locks held by swapper/7/0:
>> [  254.185501]  #0: ffff8881f6d89db8 ((&ndev->rs_timer)){+.-.}-{0:0},
>> at: call_timer_fn+0xc8/0x440
>> [  254.186496]  #1: ffffffff854415e0 (rcu_read_lock){....}-{1:2}, at:
>> ndisc_send_skb+0x761/0x12e0
>> [  254.187444]  #2: ffffffff85441580 (rcu_read_lock_bh){....}-{1:2},
>> at: ip6_finish_output2+0x2da/0x1e00
>> [  254.188447]  #3: ffffffff85441580 (rcu_read_lock_bh){....}-{1:2},
>> at: __dev_queue_xmit+0x1de/0x2910
>> [  254.189502]  #4: ffffffff854415e0 (rcu_read_lock){....}-{1:2}, at:
>> veth_xmit+0x41/0x830
>> [  254.190455]
>> [  254.190455] stack backtrace:
>> [  254.190963] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G           O
>>   5.16.0-rc7-02011-g64923127f1b3 #3784
>> [  254.192109] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
>> BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
>> [  254.193427] Call Trace:
>> [  254.193711]  <IRQ>
>> [  254.193945]  dump_stack_lvl+0x44/0x57
>> [  254.194418]  mark_lock.part.54+0x157b/0x2210
>> [  254.194940]  ? mark_lock.part.54+0xfd/0x2210
>> [  254.195451]  ? print_usage_bug+0x80/0x80
>> [  254.195896]  ? rcu_read_lock_sched_held+0x91/0xc0
>> [  254.196413]  ? rcu_read_lock_bh_held+0xa0/0xa0
>> [  254.196903]  ? rcu_read_lock_bh_held+0xa0/0xa0
>> [  254.197389]  ? find_held_lock+0x33/0x1c0
>> [  254.197826]  ? lock_release+0x3a1/0x650
>> [  254.198251]  ? __stack_depot_save+0x274/0x490
>> [  254.198742]  ? lock_acquire+0x19a/0x450
>> [  254.199175]  ? lock_downgrade+0x690/0x690
>> [  254.199626]  ? do_raw_spin_lock+0x11d/0x270
>> [  254.200091]  ? rwlock_bug.part.2+0x90/0x90
>> [  254.200550]  __lock_acquire+0x151f/0x6310
>> [  254.201000]  ? mark_lock.part.54+0xfd/0x2210
>> [  254.201470]  ? lockdep_hardirqs_on_prepare+0x3f0/0x3f0
>> [  254.202083]  ? lock_is_held_type+0xda/0x130
>> [  254.202592]  ? rcu_read_lock_sched_held+0x91/0xc0
>> [  254.203134]  ? rcu_read_lock_bh_held+0xa0/0xa0
>> [  254.203630]  lock_acquire+0x18a/0x450
>> [  254.204041]  ? veth_xmit+0x361/0x830
>> [  254.204455]  ? lock_release+0x650/0x650
>> [  254.204932]  ? eth_gro_receive+0xc60/0xc60
>> [  254.205421]  ? rcu_read_lock_held+0x91/0xa0
>> [  254.205912]  _raw_spin_lock+0x2f/0x40
>> [  254.206314]  ? veth_xmit+0x361/0x830
>> [  254.206707]  veth_xmit+0x361/0x830
>>
>> I suspect it points out that local_bh_disable is needed
>> around xdp_do_flush.
>>
>> That's why I asked you to test it with something
>> more than 3 in NUM_PKTS.
>> What values did you test it with? I hope not just 10.
>>
>> Please make sure XDP_PASS/TX/REDIRECT are all stress tested.
>
> Okay, finally managed to reproduce this; it did not show up at all in my
> own tests.
>
> Did you run the old version of the selftest by any chance? At least I
> can only reproduce it with the forwarding sysctl enabled; it happens
> because the XDP_PASS path races with the XDP_REDIRECT path and end up
> trying to grab the same lock, which only happens when the XDP_PASS path
> sends the packets back out the same interface. The fix is to extend the
> local_bh_disable() to cover the full loop in xdp_test_run_batch().
>
> I'll send an update with that fixed. But I'm not sure what to do about
> the selftest? I can keep the forwarding enabled + 1 million iterations -
> that seems to trigger the bug fairly reliably for me, but it takes a bit
> longer to run. Is that acceptable?

The absolute difference is just over three seconds on my machine, BTW:

1M pkts:

[root@(none) bpf]# time ./test_progs -a xdp_do_redirect
#221 xdp_do_redirect:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

real	0m5,042s
user	0m0,109s
sys	0m3,968s

10 pkts:

[root@(none) bpf]# time ./test_progs -a xdp_do_redirect
#221 xdp_do_redirect:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

real	0m1,823s
user	0m0,117s
sys	0m0,685s


-Toke


      reply	other threads:[~2022-01-07 16:06 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-06 19:59 [PATCH bpf-next v6 0/3] Add support for transmitting packets using XDP in bpf_prog_run() Toke Høiland-Jørgensen
2022-01-06 19:59 ` [PATCH bpf-next v6 1/3] bpf: Add "live packet" mode for " Toke Høiland-Jørgensen
2022-01-06 19:59 ` [PATCH bpf-next v6 2/3] selftests/bpf: Move open_netns() and close_netns() into network_helpers.c Toke Høiland-Jørgensen
2022-01-06 19:59 ` [PATCH bpf-next v6 3/3] selftests/bpf: Add selftest for XDP_REDIRECT in bpf_prog_run() Toke Høiland-Jørgensen
2022-01-07  0:50   ` Alexei Starovoitov
2022-01-07 15:54     ` Toke Høiland-Jørgensen
2022-01-07 16:05       ` Toke Høiland-Jørgensen [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=877dbba5uh.fsf@toke.dk \
    --to=toke@redhat.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=hawk@kernel.org \
    --cc=john.fastabend@gmail.com \
    --cc=kafai@fb.com \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=shuah@kernel.org \
    --cc=songliubraving@fb.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.