netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	netdev@vger.kernel.org, netfilter-devel@vger.kernel.org
Cc: Jakub Kicinski <kuba@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Pablo Neira Ayuso <pablo@netfilter.org>,
	Jozsef Kadlecsik <kadlec@netfilter.org>,
	Florian Westphal <fw@strlen.de>,
	Jamal Hadi Salim <jhs@mojatatu.com>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	Jiri Pirko <jiri@resnulli.us>,
	"Ahmed S. Darwish" <a.darwish@linutronix.de>,
	Eric Dumazet <edumazet@google.com>,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH net-next 9/9] net: sched: Remove Qdisc::running sequence counter
Date: Mon, 18 Oct 2021 10:23:19 -0700	[thread overview]
Message-ID: <1cdc197a-f9c8-34e4-b19c-132dbbbcafb5@gmail.com> (raw)
In-Reply-To: <20211016084910.4029084-10-bigeasy@linutronix.de>



On 10/16/21 1:49 AM, Sebastian Andrzej Siewior wrote:
> From: "Ahmed S. Darwish" <a.darwish@linutronix.de>
> 
> The Qdisc::running sequence counter has two uses:
> 
>   1. Reliably reading qdisc's tc statistics while the qdisc is running
>      (a seqcount read/retry loop at gnet_stats_add_basic()).
> 
>   2. As a flag, indicating whether the qdisc in question is running
>      (without any retry loops).
> 
> For the first usage, the Qdisc::running sequence counter write section,
> qdisc_run_begin() => qdisc_run_end(), covers a much wider area than what
> is actually needed: the raw qdisc's bstats update. A u64_stats sync
> point was thus introduced (in previous commits) inside the bstats
> structure itself. A local u64_stats write section is then started and
> stopped for the bstats updates.
> 
> Use that u64_stats sync point mechanism for the bstats read/retry loop
> at gnet_stats_add_basic().
> 
> For the second qdisc->running usage, a __QDISC_STATE_RUNNING bit flag,
> accessed with atomic bitops, is sufficient. Using a bit flag instead of
> a sequence counter at qdisc_run_begin/end() and qdisc_is_running() leads
> to the SMP barriers implicitly added through raw_read_seqcount() and
> write_seqcount_begin/end() getting removed. All call sites have been
> surveyed though, and no required ordering was identified.
> 
> Now that the qdisc->running sequence counter is no longer used, remove
> it.
> 
> Note, using u64_stats implies no sequence counter protection for 64-bit
> architectures. This can lead to the qdisc tc statistics "packets" vs.
> "bytes" values getting out of sync on rare occasions. The individual
> values will still be valid.
> 
> Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>


I see this has been merged this week end before we could test this thing during work days :/

Just add a rate estimator on a qdisc:

tc qd add dev lo root est 1sec 4sec pfifo

then :

[  140.824352] ------------[ cut here ]------------
[  140.824361] WARNING: CPU: 15 PID: 0 at net/core/gen_stats.c:157 gnet_stats_add_basic+0x97/0xc0
[  140.824378] Modules linked in: ipvlan bonding vfat fat w1_therm i2c_mux_pca954x i2c_mux ds2482 wire cdc_acm ehci_pci ehci_hcd bnx2x mdio xt_TCPMSS ip6table_mangle ip6_tables ipv6
[  140.824413] CPU: 15 PID: 0 Comm: swapper/15 Not tainted 5.15.0-smp-DEV #73
[  140.824415] Hardware name: Intel RML,PCH/Ibis_QC_18, BIOS 2.48.0 10/02/2019
[  140.824417] RIP: 0010:gnet_stats_add_basic+0x97/0xc0
[  140.824420] Code: 2c 38 4a 03 5c 38 08 48 c7 c6 68 15 51 a4 e8 60 00 c7 ff 44 39 e0 72 db 89 d8 eb 05 31 c0 45 31 ed 4d 01 2e 49 01 46 08 eb 17 <0f> 0b 4d 85 ff 75 96 48 8b 02 48 8b 4a 08 49 01 06 89 c8 49 01 46
[  140.824432] RSP: 0018:ffff99415fbc5e08 EFLAGS: 00010206
[  140.824434] RAX: 0000000080000100 RBX: ffff9939812f41d0 RCX: 0000000000000001
[  140.824436] RDX: ffff99399705e0b0 RSI: 0000000000000000 RDI: ffff99415fbc5e40
[  140.824438] RBP: ffff99415fbc5e30 R08: 0000000000000000 R09: 0000000000000000
[  140.824440] R10: 0000000000000000 R11: ffffffffffffffff R12: ffff99415fbd7740
[  140.824441] R13: dead000000000122 R14: ffff99415fbc5e40 R15: 0000000000000000
[  140.824443] FS:  0000000000000000(0000) GS:ffff99415fbc0000(0000) knlGS:0000000000000000
[  140.824445] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  140.824447] CR2: 000000000087fff0 CR3: 0000000f11610006 CR4: 00000000000606e0
[  140.824449] Call Trace:
[  140.824450]  <IRQ>
[  140.824453]  ? local_bh_enable+0x20/0x20
[  140.824457]  est_timer+0x5e/0x130
[  140.824460]  call_timer_fn+0x2c/0x110
[  140.824464]  expire_timers+0x4c/0xf0
[  140.824467]  __run_timers+0x16f/0x1b0
[  140.824470]  run_timer_softirq+0x1d/0x40
[  140.824473]  __do_softirq+0x142/0x2a1
[  140.824477]  irq_exit_rcu+0x6b/0xb0
[  140.824480]  sysvec_apic_timer_interrupt+0x79/0x90
[  140.824483]  </IRQ>
[  140.824493]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[  140.824497] RIP: 0010:cpuidle_enter_state+0x19b/0x300
[  140.824502] Code: ff 45 84 e4 74 20 48 c7 45 c8 00 00 00 00 9c 8f 45 c8 f7 45 c8 00 02 00 00 0f 85 e4 00 00 00 31 ff e8 c9 0d 88 ff fb 8b 45 bc <85> c0 78 52 48 89 de 89 c3 48 6b d3 68 48 8b 4c 16 48 4c 2b 6d b0
[  140.824503] RSP: 0018:ffff99398089be60 EFLAGS: 00000246
[  140.824505] RAX: 0000000000000004 RBX: ffffffffa446cb28 RCX: 000000000000001f
[  140.824506] RDX: 000000000000000f RSI: 0000000000000000 RDI: 0000000000000000
[  140.824507] RBP: ffff99398089beb0 R08: 0000000000000002 R09: 00000020cf9326e4
[  140.824508] R10: 0000000000638824 R11: 0000000000000000 R12: 0000000000000000
[  140.824509] R13: 00000020c9c8d180 R14: ffffc4733fbe1c50 R15: 0000000000000004
[  140.824511]  cpuidle_enter+0x2e/0x40
[  140.824514]  do_idle+0x19f/0x240
[  140.824517]  cpu_startup_entry+0x25/0x30
[  140.824519]  start_secondary+0x7c/0x80
[  140.824521]  secondary_startup_64_no_verify+0xc3/0xcb
[  140.824525] ---[ end trace d64fa4b3dc94b292 ]---



  reply	other threads:[~2021-10-18 17:23 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-16  8:49 [PATCH net-next 0/9] Try to simplify the gnet_stats and remove qdisc->running sequence counter Sebastian Andrzej Siewior
2021-10-16  8:49 ` [PATCH net-next 1/9] gen_stats: Add instead Set the value in __gnet_stats_copy_basic() Sebastian Andrzej Siewior
2021-10-16  8:49 ` [PATCH net-next 2/9] gen_stats: Add gnet_stats_add_queue() Sebastian Andrzej Siewior
2021-10-16  8:49 ` [PATCH net-next 3/9] mq, mqprio: Use gnet_stats_add_queue() Sebastian Andrzej Siewior
2021-10-16  8:49 ` [PATCH net-next 4/9] gen_stats: Move remaining users to gnet_stats_add_queue() Sebastian Andrzej Siewior
2021-10-16  8:49 ` [PATCH net-next 5/9] u64_stats: Introduce u64_stats_set() Sebastian Andrzej Siewior
2021-10-16  8:49 ` [PATCH net-next 6/9] net: sched: Protect Qdisc::bstats with u64_stats Sebastian Andrzej Siewior
2021-10-16  8:49 ` [PATCH net-next 7/9] net: sched: Use _bstats_update/set() instead of raw writes Sebastian Andrzej Siewior
2021-10-16  8:49 ` [PATCH net-next 8/9] net: sched: Merge Qdisc::bstats and Qdisc::cpu_bstats data types Sebastian Andrzej Siewior
2021-10-16  8:49 ` [PATCH net-next 9/9] net: sched: Remove Qdisc::running sequence counter Sebastian Andrzej Siewior
2021-10-18 17:23   ` Eric Dumazet [this message]
2021-10-18 18:30     ` Eric Dumazet
2021-10-18 19:24       ` Eric Dumazet
2021-10-18 23:53       ` Eric Dumazet
2021-10-19 10:12     ` [PATCH net-next] net: sched: Allow statistics reads from softirq Sebastian Andrzej Siewior
2021-10-19 12:10       ` patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1cdc197a-f9c8-34e4-b19c-132dbbbcafb5@gmail.com \
    --to=eric.dumazet@gmail.com \
    --cc=a.darwish@linutronix.de \
    --cc=bigeasy@linutronix.de \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=fw@strlen.de \
    --cc=jhs@mojatatu.com \
    --cc=jiri@resnulli.us \
    --cc=kadlec@netfilter.org \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=pablo@netfilter.org \
    --cc=tglx@linutronix.de \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).