From: stranche@codeaurora.org
To: Yuchung Cheng <ycheng@google.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
soheil@google.com, netdev@vger.kernel.org
Subject: Re: WARN_ON in TLP causing RT throttling
Date: Thu, 27 Sep 2018 18:16:24 -0600
Message-ID: <f4103fc7d67dff06718bbc5a992170cb@codeaurora.org>
In-Reply-To: <CAK6E8=cMEGMVr+2fpPpoyQ3cqcKb0B1YzzR==bwiP=vQ5oCnmA@mail.gmail.com>
On 2018-09-27 13:14, Yuchung Cheng wrote:
> On Wed, Sep 26, 2018 at 5:09 PM, Eric Dumazet <eric.dumazet@gmail.com>
> wrote:
>>
>>
>>
>> On 09/26/2018 04:46 PM, stranche@codeaurora.org wrote:
>> > Hi Eric,
>> >
>> > Someone recently reported a crash to us on the 4.14.62 kernel where excessive
>> > WARNING prints were spamming the logs and causing watchdog bites. The kernel
>> > does have the following commit by Soheil:
>> > bffd168c3fc5 "tcp: clear tp->packets_out when purging write queue"
>> >
>> > Before this bug we see over 1 second of continuous WARN_ON prints from
>> > tcp_send_loss_probe() like so:
>> >
>> > 7795.530450: <2> tcp_send_loss_probe+0x194/0x1b8
>> > 7795.534833: <2> tcp_write_timer_handler+0xf8/0x1c4
>> > 7795.539492: <2> tcp_write_timer+0x4c/0x74
>> > 7795.543348: <2> call_timer_fn+0xc0/0x1b4
>> > 7795.547113: <2> run_timer_softirq+0x248/0x81c
>> >
>> > Specifically, the prints come from the following check:
>> >
>> > /* Retransmit last segment. */
>> > if (WARN_ON(!skb))
>> > goto rearm_timer;
>> >
>> > Since skb is always NULL, we know there's nothing on the write queue or the
>> > retransmit queue, so we just keep resetting the timer, waiting for more data
>> > to be queued. However, we were able to determine that the TCP socket is in the
>> > TCP_FIN_WAIT1 state, so we will no longer be sending any data and these queues
>> > remain empty.
>> >
>> > Would it be appropriate to stop resetting the TLP timer if we detect that the
>> > connection is starting to close and we have no more data to send the probe with,
>> > or is there some way that this scenario should already be handled?
>> >
>> > Unfortunately, we don't have a reproducer for this crash.
>> >
>>
>> Something is fishy.
>>
>> If there is no skb in the queues, then tp->packets_out should be 0,
>> therefore tcp_rearm_rto() should simply call
>> inet_csk_clear_xmit_timer(sk, ICSK_TIME_RETRANS);
>>
>> I have never seen this report before.
> Do you use Fast Open? I am wondering if it's a bug when a TFO server
> closes the socket before the handshake finishes...
>
> Either way, it's pretty safe to just stop TLP if write queue is empty
> for any unexpected reason.
>
>>
Hi Yuchung,

Based on the dumps we were able to get, it appears that TFO was not used
in this case.

We also ran some local experiments in which we dropped incoming SYN
packets on the receive side after TFO connections had already succeeded,
to see whether TFO would trigger this scenario, but we have not been
able to reproduce it.

One other interesting thing we found is that the socket never sent or
received any data. It only sent/received the packets for the initial
handshake and the outgoing FIN.
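
For illustration only, here is a small userspace model of the loop
discussed above and of the guard suggested earlier in the thread ("stop
TLP if write queue is empty"). It is a sketch, not kernel code: toy_sock,
toy_rearm(), toy_probe_reported(), toy_probe_guarded() and toy_run() are
hypothetical names, and the assumption that packets_out is stale
(non-zero with empty queues) is inferred from Eric's remark rather than
confirmed by the dumps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the socket state; not the real struct tcp_sock. */
struct toy_sock {
    const void *last_skb;      /* last queued segment, NULL if queues are empty */
    unsigned int packets_out;  /* segments the stack believes are in flight */
    bool timer_armed;          /* is the probe/retransmit timer pending? */
    unsigned int warnings;     /* how often the WARN_ON path would have fired */
};

/* Models Eric's description: with packets_out == 0 the timer is cleared. */
static void toy_rearm(struct toy_sock *sk)
{
    sk->timer_armed = (sk->packets_out != 0);
}

/* Behaviour as reported: a missing skb warns, then goes to rearm. */
static void toy_probe_reported(struct toy_sock *sk)
{
    assert(sk != NULL);
    if (!sk->last_skb)
        sk->warnings++;        /* stands in for WARN_ON(!skb) */
    toy_rearm(sk);             /* taken whether or not a probe was sent */
}

/* Suggested guard: with nothing to probe with, stop the timer. */
static void toy_probe_guarded(struct toy_sock *sk)
{
    assert(sk != NULL);
    if (!sk->last_skb) {
        sk->timer_armed = false;
        return;
    }
    toy_rearm(sk);
}

/* Fire the timer until it is disarmed or max_fires is reached. */
static unsigned int toy_run(struct toy_sock *sk,
                            void (*handler)(struct toy_sock *),
                            unsigned int max_fires)
{
    unsigned int fires = 0;

    while (sk->timer_armed && fires < max_fires) {
        handler(sk);
        fires++;
    }
    return fires;
}
```

In this model, a socket with empty queues and a stale packets_out keeps
firing (and warning) under toy_probe_reported() until the cap is hit,
while toy_probe_guarded() disarms the timer on the first fire. With
packets_out == 0, both variants stop after one fire, which matches the
expectation that tcp_rearm_rto() should clear the timer.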