From: Oleksandr Natalenko <oleksandr@natalenko.name>
To: Yuchung Cheng <ycheng@google.com>
Cc: Roman Gushchin <guro@fb.com>,
Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>,
Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
netdev <netdev@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [REGRESSION] Warning in tcp_fastretrans_alert() of net/ipv4/tcp_input.c
Date: Thu, 28 Sep 2017 10:14:10 +0200
Message-ID: <2325466.Xo6SG5M5hd@natalenko.name>
In-Reply-To: <CAK6E8=c3mNnjuWDaA8rjnCjn8mAih=ReQiZwGUgB5X2xrjaRiA@mail.gmail.com>
Hi.
I can't comment on the panic in tcp_sacktag_walk(), since I cannot trigger it
intentionally, but setting net.ipv4.tcp_retrans_collapse to 0 *does not* fix
the warning in tcp_fastretrans_alert() for me.
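For anyone repeating the test, a minimal sketch for confirming which values are
actually in effect before trying to reproduce. The function name
show_tcp_knobs and its optional root-directory argument are illustrative
assumptions, not part of any tool; the two files read are the procfs
counterparts of the sysctls named in this thread.

```shell
# Print the current values of the two sysctls discussed in this thread.
# The optional first argument (hypothetical, for testing only) replaces
# /proc/sys with another directory tree of the same layout.
show_tcp_knobs() {
    root="${1:-/proc/sys}"
    for knob in tcp_recovery tcp_retrans_collapse; do
        f="$root/net/ipv4/$knob"
        if [ -r "$f" ]; then
            printf 'net.ipv4.%s = %s\n' "$knob" "$(cat "$f")"
        else
            # Older or differently configured kernels may lack a knob.
            printf 'net.ipv4.%s = (not available)\n' "$knob"
        fi
    done
}
```

Calling show_tcp_knobs with no argument reads the live values from /proc/sys.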
On Wednesday, 27 September 2017 2:18:32 CEST Yuchung Cheng wrote:
> On Tue, Sep 26, 2017 at 5:12 PM, Yuchung Cheng <ycheng@google.com> wrote:
> > On Tue, Sep 26, 2017 at 6:10 AM, Roman Gushchin <guro@fb.com> wrote:
> >>> On Wed, Sep 20, 2017 at 6:46 PM, Roman Gushchin <guro@fb.com> wrote:
> >>> > > Hello.
> >>> > >
> >>> > > Since, IIRC, v4.11, there is some regression in TCP stack resulting
> >>> > > in the
> >>> > > warning shown below. Most of the time it is harmless, but rarely it
> >>> > > just
> >>> > > causes either freeze or (I believe, this is related too) panic in
> >>> > > tcp_sacktag_walk() (because sk_buff passed to this function is
> >>> > > NULL).
> >>> > > Unfortunately, I still do not have proper stacktrace from panic, but
> >>> > > will try to capture it if possible.
> >>> > >
> >>> > > Also, I have custom settings regarding TCP stack, shown below as
> >>> > > well. ifb is used to shape traffic with tc.
> >>> > >
> >>> > > Please note this regression was already reported as BZ [1] and as a
> >>> > > letter to ML [2], but got neither attention nor resolution. It is
> >>> > > reproducible for (not only) me on my home router since v4.11 till
> >>> > > v4.13.1 incl.
> >>> > >
> >>> > > Please advise on how to deal with it. I'll provide any additional
> >>> > > info if
> >>> > > necessary, also ready to test patches if any.
> >>> > >
> >>> > > Thanks.
> >>> > >
> >>> > > [1] https://bugzilla.kernel.org/show_bug.cgi?id=195835
> >>> > > [2]
> >>> > > https://urldefense.proofpoint.com/v2/url?u=https-3A__www.spinics.ne
> >>> > > t_lists_netdev_msg436158.html&d=DwIBaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=jJ
> >>> > > YgtDM7QT-W-Fz_d29HYQ&m=MDDRfLG5DvdOeniMpaZDJI8ulKQ6PQ6OX_1YtRsiTMA&s
> >>> > > =-n3dGZw-pQ95kMBUfq5G9nYZFcuWtbTDlYFkcvQPoKc&e=
> >>> >
> >>> > We're experiencing the same problems on some machines in our fleet.
> >>> > Exactly the same symptoms: tcp_fastretrans_alert() warnings and
> >>> > sometimes panics in tcp_sacktag_walk().
> >>
> >>> > Here is an example of a backtrace with the panic log:
> >> Hi Yuchung!
> >>
> >>> do you still see the panics if you disable RACK?
> >>> sysctl net.ipv4.tcp_recovery=0?
> >>
> >> No, we haven't seen any crash since that.
> >
> > I am out of ideas how RACK could cause tcp_sacktag_walk() to
> > take an empty skb :-( Do you have a stack trace or any hint on which call
> > to tcp_sacktag_walk() triggered the panic? Internally at Google we never
> > see that.
>
> hmm something just struck me: could you try
> sysctl net.ipv4.tcp_recovery=1 net.ipv4.tcp_retrans_collapse=0
> and see if kernel still panics on sack processing?
>
> >>> Also, have you experienced any SACK reneging? Could you post the output of
> >>> 'nstat | grep -i TCP'? Thanks.
> >>
> >> hostname TcpActiveOpens 2289680 0.0
> >> hostname TcpPassiveOpens 3592758 0.0
> >> hostname TcpAttemptFails 746910 0.0
> >> hostname TcpEstabResets 154988 0.0
> >> hostname TcpInSegs 16258678255 0.0
> >> hostname TcpOutSegs 46967011611 0.0
> >> hostname TcpRetransSegs 13724310 0.0
> >> hostname TcpInErrs 2 0.0
> >> hostname TcpOutRsts 9418798 0.0
> >> hostname TcpExtEmbryonicRsts 2303 0.0
> >> hostname TcpExtPruneCalled 90192 0.0
> >> hostname TcpExtOfoPruned 57274 0.0
> >> hostname TcpExtOutOfWindowIcmps 3 0.0
> >> hostname TcpExtTW 1164705 0.0
> >> hostname TcpExtTWRecycled 2 0.0
> >> hostname TcpExtPAWSEstab 159 0.0
> >> hostname TcpExtDelayedACKs 209207209 0.0
> >> hostname TcpExtDelayedACKLocked 508571 0.0
> >> hostname TcpExtDelayedACKLost 1713248 0.0
> >> hostname TcpExtListenOverflows 625 0.0
> >> hostname TcpExtListenDrops 625 0.0
> >> hostname TcpExtTCPHPHits 9341188489 0.0
> >> hostname TcpExtTCPPureAcks 1434646465 0.0
> >> hostname TcpExtTCPHPAcks 5733614672 0.0
> >> hostname TcpExtTCPSackRecovery 3261698 0.0
> >> hostname TcpExtTCPSACKReneging 12203 0.0
> >> hostname TcpExtTCPSACKReorder 433189 0.0
> >> hostname TcpExtTCPTSReorder 22694 0.0
> >> hostname TcpExtTCPFullUndo 45092 0.0
> >> hostname TcpExtTCPPartialUndo 22016 0.0
> >> hostname TcpExtTCPLossUndo 2150040 0.0
> >> hostname TcpExtTCPLostRetransmit 60119 0.0
> >> hostname TcpExtTCPSackFailures 2626782 0.0
> >> hostname TcpExtTCPLossFailures 182999 0.0
> >> hostname TcpExtTCPFastRetrans 4334275 0.0
> >> hostname TcpExtTCPSlowStartRetrans 3453348 0.0
> >> hostname TcpExtTCPTimeouts 1070997 0.0
> >> hostname TcpExtTCPLossProbes 2633545 0.0
> >> hostname TcpExtTCPLossProbeRecovery 941647 0.0
> >> hostname TcpExtTCPSackRecoveryFail 336302 0.0
> >> hostname TcpExtTCPRcvCollapsed 461354 0.0
> >> hostname TcpExtTCPAbortOnData 349196 0.0
> >> hostname TcpExtTCPAbortOnClose 3395 0.0
> >> hostname TcpExtTCPAbortOnTimeout 51201 0.0
> >> hostname TcpExtTCPMemoryPressures 2 0.0
> >> hostname TcpExtTCPSpuriousRTOs 2120503 0.0
> >> hostname TcpExtTCPSackShifted 2613736 0.0
> >> hostname TcpExtTCPSackMerged 21358743 0.0
> >> hostname TcpExtTCPSackShiftFallback 8769387 0.0
> >> hostname TcpExtTCPBacklogDrop 5 0.0
> >> hostname TcpExtTCPRetransFail 843 0.0
> >> hostname TcpExtTCPRcvCoalesce 949068035 0.0
> >> hostname TcpExtTCPOFOQueue 470118 0.0
> >> hostname TcpExtTCPOFODrop 9915 0.0
> >> hostname TcpExtTCPOFOMerge 9 0.0
> >> hostname TcpExtTCPChallengeACK 90 0.0
> >> hostname TcpExtTCPSYNChallenge 3 0.0
> >> hostname TcpExtTCPFastOpenActive 2089 0.0
> >> hostname TcpExtTCPSpuriousRtxHostQueues 896596 0.0
> >> hostname TcpExtTCPAutoCorking 547386735 0.0
> >> hostname TcpExtTCPFromZeroWindowAdv 28757 0.0
> >> hostname TcpExtTCPToZeroWindowAdv 28761 0.0
> >> hostname TcpExtTCPWantZeroWindowAdv 322431 0.0
> >> hostname TcpExtTCPSynRetrans 3026 0.0
> >> hostname TcpExtTCPOrigDataSent 40976870977 0.0
> >> hostname TcpExtTCPHystartTrainDetect 453920 0.0
> >> hostname TcpExtTCPHystartTrainCwnd 11586273 0.0
> >> hostname TcpExtTCPHystartDelayDetect 10943 0.0
> >> hostname TcpExtTCPHystartDelayCwnd 763554 0.0
> >> hostname TcpExtTCPACKSkippedPAWS 30 0.0
> >> hostname TcpExtTCPACKSkippedSeq 218 0.0
> >> hostname TcpExtTCPWinProbe 2408 0.0
> >> hostname TcpExtTCPKeepAlive 213768 0.0
> >> hostname TcpExtTCPMTUPFail 69 0.0
> >> hostname TcpExtTCPMTUPSuccess 8811 0.0
> >>
> >> Thanks!
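
To pull the SACK reneging figure out of a counter dump like the one quoted
above, something along these lines may help. It is only a sketch: the helper
name counter_value is made up for illustration, and it assumes the counter
name is followed by its value on the same line, with or without a leading
host column as in the paste.

```shell
# Print the value of one counter from nstat-style lines read on stdin.
# The counter name is matched as a whole field and the field right after
# it is printed, so an extra leading column (such as a host name) is fine.
counter_value() {
    awk -v name="$1" '{
        for (i = 1; i < NF; i++)
            if ($i == name) { print $(i + 1); exit }
    }'
}
```

For example, 'nstat -az | counter_value TcpExtTCPSACKReneging' prints the
absolute count on the local machine (-a ignores nstat's history, -z includes
counters that are zero).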