All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bagas Sanjaya <bagasdotme@gmail.com>
To: Martin Zaharinov <micron10@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>,
	Paolo Abeni <pabeni@redhat.com>, netdev <netdev@vger.kernel.org>,
	patchwork-bot+netdevbpf@kernel.org,
	Jakub Kicinski <kuba@kernel.org>,
	Stephen Hemminger <stephen@networkplumber.org>,
	kuba+netdrv@kernel.org, dsahern@gmail.com,
	Florian Westphal <fw@strlen.de>,
	Pablo Neira Ayuso <pablo@netfilter.org>,
	Thorsten Leemhuis <regressions@leemhuis.info>,
	Wangyang Guo <wangyang.guo@intel.com>,
	Arjan Van De Ven <arjan.van.de.ven@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Linux Regressions <regressions@lists.linux.dev>
Subject: Re: Urgent Bug Report Kernel crash 6.5.2
Date: Fri, 22 Sep 2023 10:06:31 +0700	[thread overview]
Message-ID: <ZQ0EtyRxjzQTrPNd@debian.me> (raw)
In-Reply-To: <6A98504D-DB99-42A5-A829-B81739822CB2@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3969 bytes --]

On Thu, Sep 21, 2023 at 11:13:55AM +0300, Martin Zaharinov wrote:
> Hi Bagas,
> 
> 
> Its not easy to make this on production, have too many users on it.
> 
> i make checks and find with kernel 6.3.12-6.5.13 all is fine.
> on first machine that i have with kernel 6.4 and still work run kernel 6.4.2 and have problem.
> 
> in my investigation problem is start after migration to kernel 6.4.x 
> 
> in 6.4 kernel is add rcuref : 
> 
> https://cdn.kernel.org/pub/linux/kernel/v6.x/ChangeLog-6.4 
> 
> commit bc9d3a9f2afca189a6ae40225b6985e3c775375e
> Author: Thomas Gleixner <tglx@linutronix.de>
> Date: Thu Mar 23 21:55:32 2023 +0100
> 
> net: dst: Switch to rcuref_t reference counting

Is it the culprit you look for? Had you done the bisection and it points
the culprit to that commit

> 
> Under high contention dst_entry::__refcnt becomes a significant bottleneck.
> 
> atomic_inc_not_zero() is implemented with a cmpxchg() loop, which goes into
> high retry rates on contention.
> 
> Switch the reference count to rcuref_t which results in a significant
> performance gain. Rename the reference count member to __rcuref to reflect
> the change.
> 
> The gain depends on the micro-architecture and the number of concurrent
> operations and has been measured in the range of +25% to +130% with a
> localhost memtier/memcached benchmark which amplifies the problem
> massively.
> 
> Running the memtier/memcached benchmark over a real (1Gb) network
> connection the conversion on top of the false sharing fix for struct
> dst_entry::__refcnt results in a total gain in the 2%-5% range over the
> upstream baseline.
> 
> Reported-by: Wangyang Guo <wangyang.guo@intel.com>
> Reported-by: Arjan Van De Ven <arjan.van.de.ven@intel.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Link: https://lore.kernel.org/r/20230307125538.989175656@linutronix.de
> Link: https://lore.kernel.org/r/20230323102800.215027837@linutronix.de
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> 
> 
> and i think problem is here : 
> 
> --- a/net/core/dst.c
> +++ b/net/core/dst.c
> @@ -66,7 +66,7 @@ void dst_init(struct dst_entry *dst, str
> dst->tclassid = 0;
> #endif
> dst->lwtstate = NULL;
> - atomic_set(&dst->__refcnt, initial_ref);
> + rcuref_init(&dst->__refcnt, initial_ref);
> dst->__use = 0;
> dst->lastuse = jiffies;
> dst->flags = flags;
> @@ -162,31 +162,15 @@ EXPORT_SYMBOL(dst_dev_put);
> 
> void dst_release(struct dst_entry *dst)
> {
> - if (dst) {
> - int newrefcnt;
> -
> - newrefcnt = atomic_dec_return(&dst->__refcnt);
> - if (WARN_ONCE(newrefcnt < 0, "dst_release underflow"))
> - net_warn_ratelimited("%s: dst:%p refcnt:%d\n",
> - __func__, dst, newrefcnt);
> - if (!newrefcnt)
> - call_rcu_hurry(&dst->rcu_head, dst_destroy_rcu);
> - }
> + if (dst && rcuref_put(&dst->__refcnt))
> + call_rcu_hurry(&dst->rcu_head, dst_destroy_rcu);
> }
> EXPORT_SYMBOL(dst_release);
> 
> void dst_release_immediate(struct dst_entry *dst)
> {
> - if (dst) {
> - int newrefcnt;
> -
> - newrefcnt = atomic_dec_return(&dst->__refcnt);
> - if (WARN_ONCE(newrefcnt < 0, "dst_release_immediate underflow"))
> - net_warn_ratelimited("%s: dst:%p refcnt:%d\n",
> - __func__, dst, newrefcnt);
> - if (!newrefcnt)
> - dst_destroy(dst);
> - }
> + if (dst && rcuref_put(&dst->__refcnt))
> + dst_destroy(dst);
> }
> EXPORT_SYMBOL(dst_release_immediate);
> 
> 
> but this is my thinking
> 

What do you think that above causes your regression?

Confused...

[To Thorsten: I'm unsure if the reporter do the bisection and suddenly he found
the culprit commit. Should I add it to regzbot? I had dealt with this reporter
before when he reported nginx regression and he didn't respond with bisection
to the point that I had to mark it as inconclusive (see regzbot dashboard).
What advice can you provide to him?]

-- 
An old man doll... just what I always wanted! - Clara

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

  reply	other threads:[~2023-09-22  3:06 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-15  4:05 Urgent Bug Report Kernel crash 6.5.2 Martin Zaharinov
2023-09-15  6:45 ` Eric Dumazet
2023-09-15 22:23   ` Martin Zaharinov
2023-11-16 14:17   ` Martin Zaharinov
2023-12-06 22:26     ` Martin Zaharinov
     [not found]       ` <5E63894D-913B-416C-B901-F628BB6C00E0@gmail.com>
2023-12-08 22:20         ` Thomas Gleixner
2023-12-08 23:01           ` Martin Zaharinov
2023-12-12 18:16             ` Thomas Gleixner
2023-12-19  9:25               ` Martin Zaharinov
2023-12-19 14:26                 ` Thomas Gleixner
2023-12-22 17:26                   ` Martin Zaharinov
2023-12-29 12:00                     ` Martin Zaharinov
2024-01-04 20:51                       ` Martin Zaharinov
2024-01-07 11:03                         ` Martin Zaharinov
2023-09-15 23:00 ` Martin Zaharinov
2023-09-15 23:11   ` Martin Zaharinov
2023-09-16  8:27     ` Paolo Abeni
     [not found]       ` <CALidq=UR=3rOHZczCnb1bEhbt9So60UZ5y60Cdh4aP41FkB5Tw@mail.gmail.com>
2023-09-17 11:35         ` Martin Zaharinov
2023-09-17 11:40         ` Martin Zaharinov
2023-09-17 11:55           ` Martin Zaharinov
2023-09-17 12:04             ` Holger Hoffstätte
2023-09-18  8:09             ` Eric Dumazet
2023-09-19 20:09             ` Martin Zaharinov
2023-09-20  3:59               ` Eric Dumazet
2023-09-20  6:05                 ` Martin Zaharinov
2023-09-20  6:16                   ` Bagas Sanjaya
2023-09-20  7:03                     ` Martin Zaharinov
2023-09-20  7:25                       ` Eric Dumazet
2023-09-20  7:29                         ` Eric Dumazet
2023-09-20  7:32                           ` Martin Zaharinov
2023-09-21  7:50                             ` Bagas Sanjaya
2023-09-21  8:13                               ` Martin Zaharinov
2023-09-22  3:06                                 ` Bagas Sanjaya [this message]
2023-09-22  9:50                                   ` Linux regression tracking (Thorsten Leemhuis)
2023-09-22 11:09                                     ` Bagas Sanjaya

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZQ0EtyRxjzQTrPNd@debian.me \
    --to=bagasdotme@gmail.com \
    --cc=arjan.van.de.ven@intel.com \
    --cc=dsahern@gmail.com \
    --cc=edumazet@google.com \
    --cc=fw@strlen.de \
    --cc=kuba+netdrv@kernel.org \
    --cc=kuba@kernel.org \
    --cc=micron10@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=pablo@netfilter.org \
    --cc=patchwork-bot+netdevbpf@kernel.org \
    --cc=regressions@leemhuis.info \
    --cc=regressions@lists.linux.dev \
    --cc=stephen@networkplumber.org \
    --cc=tglx@linutronix.de \
    --cc=wangyang.guo@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.