From: Youngmin Nam <youngmin.nam@samsung.com>
To: Neal Cardwell <ncardwell@google.com>, Eric Dumazet <edumazet@google.com>
Cc: Youngmin Nam <youngmin.nam@samsung.com>,
davem@davemloft.net, dsahern@kernel.org, kuba@kernel.org,
pabeni@redhat.com, horms@kernel.org, dujeong.lee@samsung.com,
guo88.liu@samsung.com, yiwang.cai@samsung.com,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
joonki.min@samsung.com, hajun.sung@samsung.com,
d7271.choe@samsung.com, sw.ju@samsung.com
Subject: Re: [PATCH] tcp: check socket state before calling WARN_ON
Date: Wed, 4 Dec 2024 12:26:18 +0900 [thread overview]
Message-ID: <Z0/L2gDjvXVfj1ho@perf> (raw)
In-Reply-To: <CADVnQynUspJL4e3UnZTKps9WmgnE-0ngQnQmn=8gjSmyg4fQ5A@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 3157 bytes --]
On Tue, Dec 03, 2024 at 10:34:46AM -0500, Neal Cardwell wrote:
> On Tue, Dec 3, 2024 at 6:07 AM Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Tue, Dec 3, 2024 at 9:10 AM Youngmin Nam <youngmin.nam@samsung.com> wrote:
> > >
> > > We encountered the following WARNINGs
> > > in tcp_sacktag_write_queue()/tcp_fastretrans_alert()
> > > which triggered a kernel panic due to panic_on_warn.
> > >
> > > case 1.
> > > ------------[ cut here ]------------
> > > WARNING: CPU: 4 PID: 453 at net/ipv4/tcp_input.c:2026
> > > Call trace:
> > > tcp_sacktag_write_queue+0xae8/0xb60
> > > tcp_ack+0x4ec/0x12b8
> > > tcp_rcv_state_process+0x22c/0xd38
> > > tcp_v4_do_rcv+0x220/0x300
> > > tcp_v4_rcv+0xa5c/0xbb4
> > > ip_protocol_deliver_rcu+0x198/0x34c
> > > ip_local_deliver_finish+0x94/0xc4
> > > ip_local_deliver+0x74/0x10c
> > > ip_rcv+0xa0/0x13c
> > > Kernel panic - not syncing: kernel: panic_on_warn set ...
> > >
> > > case 2.
> > > ------------[ cut here ]------------
> > > WARNING: CPU: 0 PID: 648 at net/ipv4/tcp_input.c:3004
> > > Call trace:
> > > tcp_fastretrans_alert+0x8ac/0xa74
> > > tcp_ack+0x904/0x12b8
> > > tcp_rcv_state_process+0x22c/0xd38
> > > tcp_v4_do_rcv+0x220/0x300
> > > tcp_v4_rcv+0xa5c/0xbb4
> > > ip_protocol_deliver_rcu+0x198/0x34c
> > > ip_local_deliver_finish+0x94/0xc4
> > > ip_local_deliver+0x74/0x10c
> > > ip_rcv+0xa0/0x13c
> > > Kernel panic - not syncing: kernel: panic_on_warn set ...
> > >
> >
> > I have not seen these warnings firing. Neal, have you seen this in the past ?
>
> I can't recall seeing these warnings over the past 5 years or so, and
> (from checking our monitoring) they don't seem to be firing in our
> fleet recently.
>
> > In any case this test on sk_state is too specific.
>
> I agree with Eric. IMHO TCP_FIN_WAIT1 deserves all the same warnings
> as ESTABLISHED, since in this state the connection may still have a
> big queue of data it is trying to reliably send to the other side,
> with full loss recovery and congestion control logic.
Yes I agree with Eric as well.
>
> I would suggest that instead of running with panic_on_warn it would
> make more sense to not panic on warning, and instead add more detail
> to these warning messages in your kernels during your testing, to help
> debug what is going wrong. I would suggest adding to the warning
> message:
>
> tp->packets_out
> tp->sacked_out
> tp->lost_out
> tp->retrans_out
> tcp_is_sack(tp)
> tp->mss_cache
> inet_csk(sk)->icsk_ca_state
> inet_csk(sk)->icsk_pmtu_cookie
Hi Neal.
Thanks for your opinion.
By the way, we enable panic_on_warn by default for stability.
As you know, panic_on_warn is not applied to a specific subsystem but to the entire kernel.
We just want to avoid the kernel panic.
So when I see below lwn article, I think we might use pr_warn() instaed of WARN_ON().
https://lwn.net/Articles/969923/
How do you think of it ?
>
> A hunch would be that this is either firing for (a) non-SACK
> connections, or (b) after an MTU reduction.
>
> In particular, you might try `echo 0 >
> /proc/sys/net/ipv4/tcp_mtu_probing` and see if that makes the warnings
> go away.
>
> cheers,
> neal
>
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
next prev parent reply other threads:[~2024-12-04 3:23 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CGME20241203081005epcas2p247b3d05bc767b1a50ba85c4433657295@epcas2p2.samsung.com>
2024-12-03 8:12 ` [PATCH] tcp: check socket state before calling WARN_ON Youngmin Nam
2024-12-03 11:07 ` Eric Dumazet
2024-12-03 15:34 ` Neal Cardwell
2024-12-04 2:18 ` Jakub Kicinski
2024-12-04 3:39 ` Youngmin Nam
2024-12-04 7:13 ` Eric Dumazet
2024-12-04 7:48 ` Dujeong.lee
2024-12-04 14:21 ` Neal Cardwell
2024-12-05 12:31 ` Dujeong.lee
2025-01-17 5:08 ` Youngmin Nam
2025-01-17 15:18 ` Neal Cardwell
2025-01-20 0:18 ` Youngmin Nam
2025-02-03 5:21 ` Youngmin Nam
2025-02-24 21:13 ` Neal Cardwell
2025-02-25 17:24 ` Neal Cardwell
2025-02-25 18:28 ` Yuchung Cheng
2025-02-25 18:43 ` Eric Dumazet
2025-03-01 5:37 ` Youngmin Nam
2025-03-14 2:49 ` Youngmin Nam
2024-12-06 5:53 ` Youngmin Nam
2024-12-06 8:35 ` Eric Dumazet
2024-12-06 9:01 ` Youngmin Nam
2024-12-06 9:08 ` Eric Dumazet
2024-12-06 15:34 ` Neal Cardwell
2024-12-09 1:52 ` Youngmin Nam
2024-12-09 1:32 ` Youngmin Nam
2024-12-09 10:16 ` Dujeong.lee
2024-12-09 10:20 ` Eric Dumazet
2024-12-10 3:38 ` Dujeong.lee
2024-12-10 7:10 ` Dujeong.lee
2024-12-18 10:18 ` Dujeong.lee
2024-12-18 10:27 ` Eric Dumazet
2024-12-30 0:23 ` Dujeong.lee
2024-12-30 9:33 ` Eric Dumazet
2025-01-02 0:22 ` Dujeong.lee
2025-01-02 8:16 ` Eric Dumazet
2025-01-03 4:16 ` Dujeong.lee
2024-12-04 3:26 ` Youngmin Nam [this message]
2024-12-04 8:55 ` Eric Dumazet
2024-12-04 3:08 ` Youngmin Nam
2024-12-04 9:03 ` Eric Dumazet
2024-12-05 2:45 ` Youngmin Nam
2024-12-13 7:14 ` Youngmin Nam
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z0/L2gDjvXVfj1ho@perf \
--to=youngmin.nam@samsung.com \
--cc=d7271.choe@samsung.com \
--cc=davem@davemloft.net \
--cc=dsahern@kernel.org \
--cc=dujeong.lee@samsung.com \
--cc=edumazet@google.com \
--cc=guo88.liu@samsung.com \
--cc=hajun.sung@samsung.com \
--cc=horms@kernel.org \
--cc=joonki.min@samsung.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=ncardwell@google.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sw.ju@samsung.com \
--cc=yiwang.cai@samsung.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.