From: Oleksandr Natalenko <oleksandr@natalenko.name>
To: Yuchung Cheng <ycheng@google.com>
Cc: Roman Gushchin <guro@fb.com>,
Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
netdev <netdev@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [REGRESSION] Warning in tcp_fastretrans_alert() of net/ipv4/tcp_input.c
Date: Thu, 28 Sep 2017 10:14:10 +0200
Message-ID: <2325466.Xo6SG5M5hd@natalenko.name>
In-Reply-To: <CAK6E8=c3mNnjuWDaA8rjnCjn8mAih=ReQiZwGUgB5X2xrjaRiA@mail.gmail.com>
Hi.
I can't comment on the panic in tcp_sacktag_walk(), since I cannot trigger
it intentionally, but setting net.ipv4.tcp_retrans_collapse to 0 *does not*
fix the warning in tcp_fastretrans_alert() for me.
On Wednesday, 27 September 2017 2:18:32 CEST Yuchung Cheng wrote:
> On Tue, Sep 26, 2017 at 5:12 PM, Yuchung Cheng <ycheng@google.com> wrote:
> > On Tue, Sep 26, 2017 at 6:10 AM, Roman Gushchin <guro@fb.com> wrote:
> >>> On Wed, Sep 20, 2017 at 6:46 PM, Roman Gushchin <guro@fb.com> wrote:
> >>> > > Hello.
> >>> > >
> >>> > > Since, IIRC, v4.11, there is a regression in the TCP stack
> >>> > > resulting in the warning shown below. Most of the time it is
> >>> > > harmless, but on rare occasions it causes either a freeze or
> >>> > > (I believe this is related too) a panic in tcp_sacktag_walk()
> >>> > > (because the sk_buff passed to this function is NULL).
> >>> > > Unfortunately, I still do not have a proper stack trace from the
> >>> > > panic, but will try to capture one if possible.
> >>> > >
> >>> > > Also, I have custom settings regarding TCP stack, shown below as
> >>> > > well. ifb is used to shape traffic with tc.
> >>> > >
> >>> > > Please note this regression was already reported as BZ [1] and as a
> >>> > > letter to ML [2], but got neither attention nor resolution. It is
> >>> > > reproducible for (not only) me on my home router since v4.11 till
> >>> > > v4.13.1 incl.
> >>> > >
> >>> > > Please advise on how to deal with it. I'll provide any additional
> >>> > > info if necessary, and I'm also ready to test patches, if any.
> >>> > >
> >>> > > Thanks.
> >>> > >
> >>> > > [1] https://bugzilla.kernel.org/show_bug.cgi?id=195835
> >>> > > [2]
> >>> > > https://urldefense.proofpoint.com/v2/url?u=https-3A__www.spinics.net_lists_netdev_msg436158.html&d=DwIBaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=jJYgtDM7QT-W-Fz_d29HYQ&m=MDDRfLG5DvdOeniMpaZDJI8ulKQ6PQ6OX_1YtRsiTMA&s=-n3dGZw-pQ95kMBUfq5G9nYZFcuWtbTDlYFkcvQPoKc&e=
> >>> >
> >>> > We're experiencing the same problems on some machines in our fleet.
> >>> > Exactly the same symptoms: tcp_fastretrans_alert() warnings and
> >>> > sometimes panics in tcp_sacktag_walk().
> >>
> >>> > Here is an example of a backtrace with the panic log:
> >> Hi Yuchung!
> >>
> >>> do you still see the panics if you disable RACK?
> >>> sysctl net.ipv4.tcp_recovery=0?
> >>
> >> No, we haven't seen any crash since that.
> >
> > I am out of ideas how RACK can potentially cause tcp_sacktag_walk to
> > take an empty skb :-( Do you have a stack trace or any hint on which call
> > to tcp_sacktag_walk triggered the panic? Internally at Google we never
> > see that.
>
> hmm something just struck me: could you try
> sysctl net.ipv4.tcp_recovery=1 net.ipv4.tcp_retrans_collapse=0
> and see if kernel still panics on sack processing?
>
> >>> Also, have you experienced any SACK reneging? Could you post the
> >>> output of 'nstat | grep -i TCP'? Thanks.
> >>
> >> hostname TcpActiveOpens 2289680 0.0
> >> hostname TcpPassiveOpens 3592758 0.0
> >> hostname TcpAttemptFails 746910 0.0
> >> hostname TcpEstabResets 154988 0.0
> >> hostname TcpInSegs 16258678255 0.0
> >> hostname TcpOutSegs 46967011611 0.0
> >> hostname TcpRetransSegs 13724310 0.0
> >> hostname TcpInErrs 2 0.0
> >> hostname TcpOutRsts 9418798 0.0
> >> hostname TcpExtEmbryonicRsts 2303 0.0
> >> hostname TcpExtPruneCalled 90192 0.0
> >> hostname TcpExtOfoPruned 57274 0.0
> >> hostname TcpExtOutOfWindowIcmps 3 0.0
> >> hostname TcpExtTW 1164705 0.0
> >> hostname TcpExtTWRecycled 2 0.0
> >> hostname TcpExtPAWSEstab 159 0.0
> >> hostname TcpExtDelayedACKs 209207209 0.0
> >> hostname TcpExtDelayedACKLocked 508571 0.0
> >> hostname TcpExtDelayedACKLost 1713248 0.0
> >> hostname TcpExtListenOverflows 625 0.0
> >> hostname TcpExtListenDrops 625 0.0
> >> hostname TcpExtTCPHPHits 9341188489 0.0
> >> hostname TcpExtTCPPureAcks 1434646465 0.0
> >> hostname TcpExtTCPHPAcks 5733614672 0.0
> >> hostname TcpExtTCPSackRecovery 3261698 0.0
> >> hostname TcpExtTCPSACKReneging 12203 0.0
> >> hostname TcpExtTCPSACKReorder 433189 0.0
> >> hostname TcpExtTCPTSReorder 22694 0.0
> >> hostname TcpExtTCPFullUndo 45092 0.0
> >> hostname TcpExtTCPPartialUndo 22016 0.0
> >> hostname TcpExtTCPLossUndo 2150040 0.0
> >> hostname TcpExtTCPLostRetransmit 60119 0.0
> >> hostname TcpExtTCPSackFailures 2626782 0.0
> >> hostname TcpExtTCPLossFailures 182999 0.0
> >> hostname TcpExtTCPFastRetrans 4334275 0.0
> >> hostname TcpExtTCPSlowStartRetrans 3453348 0.0
> >> hostname TcpExtTCPTimeouts 1070997 0.0
> >> hostname TcpExtTCPLossProbes 2633545 0.0
> >> hostname TcpExtTCPLossProbeRecovery 941647 0.0
> >> hostname TcpExtTCPSackRecoveryFail 336302 0.0
> >> hostname TcpExtTCPRcvCollapsed 461354 0.0
> >> hostname TcpExtTCPAbortOnData 349196 0.0
> >> hostname TcpExtTCPAbortOnClose 3395 0.0
> >> hostname TcpExtTCPAbortOnTimeout 51201 0.0
> >> hostname TcpExtTCPMemoryPressures 2 0.0
> >> hostname TcpExtTCPSpuriousRTOs 2120503 0.0
> >> hostname TcpExtTCPSackShifted 2613736 0.0
> >> hostname TcpExtTCPSackMerged 21358743 0.0
> >> hostname TcpExtTCPSackShiftFallback 8769387 0.0
> >> hostname TcpExtTCPBacklogDrop 5 0.0
> >> hostname TcpExtTCPRetransFail 843 0.0
> >> hostname TcpExtTCPRcvCoalesce 949068035 0.0
> >> hostname TcpExtTCPOFOQueue 470118 0.0
> >> hostname TcpExtTCPOFODrop 9915 0.0
> >> hostname TcpExtTCPOFOMerge 9 0.0
> >> hostname TcpExtTCPChallengeACK 90 0.0
> >> hostname TcpExtTCPSYNChallenge 3 0.0
> >> hostname TcpExtTCPFastOpenActive 2089 0.0
> >> hostname TcpExtTCPSpuriousRtxHostQueues 896596 0.0
> >> hostname TcpExtTCPAutoCorking 547386735 0.0
> >> hostname TcpExtTCPFromZeroWindowAdv 28757 0.0
> >> hostname TcpExtTCPToZeroWindowAdv 28761 0.0
> >> hostname TcpExtTCPWantZeroWindowAdv 322431 0.0
> >> hostname TcpExtTCPSynRetrans 3026 0.0
> >> hostname TcpExtTCPOrigDataSent 40976870977 0.0
> >> hostname TcpExtTCPHystartTrainDetect 453920 0.0
> >> hostname TcpExtTCPHystartTrainCwnd 11586273 0.0
> >> hostname TcpExtTCPHystartDelayDetect 10943 0.0
> >> hostname TcpExtTCPHystartDelayCwnd 763554 0.0
> >> hostname TcpExtTCPACKSkippedPAWS 30 0.0
> >> hostname TcpExtTCPACKSkippedSeq 218 0.0
> >> hostname TcpExtTCPWinProbe 2408 0.0
> >> hostname TcpExtTCPKeepAlive 213768 0.0
> >> hostname TcpExtTCPMTUPFail 69 0.0
> >> hostname TcpExtTCPMTUPSuccess 8811 0.0
> >>
> >> Thanks!