From mboxrd@z Thu Jan 1 00:00:00 1970
From: Daniel Borkmann
Subject: Re: Linux ECN Handling
Date: Mon, 06 Nov 2017 15:08:54 +0100
Message-ID: <5A006CF6.1020608@iogearbox.net>
References: <20171019124312.GE16796@breakpoint.cc>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Netdev, Florian Westphal, Mohammad Alizadeh, Lawrence Brakmo
To: Neal Cardwell, Steve Ibanez
Return-path:
Received: from www62.your-server.de ([213.133.104.62]:47915 "EHLO www62.your-server.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753250AbdKFOI6 (ORCPT ); Mon, 6 Nov 2017 09:08:58 -0500
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

On 10/24/2017 03:11 AM, Neal Cardwell wrote:
> On Mon, Oct 23, 2017 at 6:15 PM, Steve Ibanez wrote:
>> Hi All,
>>
>> I upgraded the kernel on all of our machines to Linux
>> 4.13.8-041308-lowlatency. However, I'm still observing the same
>> behavior where the source enters a timeout when the CWND=1MSS and it
>> receives ECN marks.
>>
>> Here are the measured flow rates:
>>
>>
>> Here are snapshots of the packet traces at the sources when they both
>> enter a timeout at t=1.6sec:
>>
>> 10.0.0.1 timeout event:
>>
>>
>> 10.0.0.3 timeout event:
>>
>>
>> Both still essentially follow the same sequence of events that I
>> mentioned earlier:
>> (1) receives an ACK for byte XYZ with the ECN flag set
>> (2) stops sending for RTO_min=300ms
>> (3) sends a retransmission for byte XYZ
>>
>> The cwnd samples reported by tcp_probe still indicate that the sources
>> are reacting to the ECN marks more than once per window. Here are the
>> cwnd samples at the same timeout event mentioned above:
>>
>>
>> Let me know if there is anything else you think I should try.
>
> Sounds like perhaps cwnd is being set to 0 somewhere in this DCTCP
> scenario. Would you be able to add printk statements in
> tcp_init_cwnd_reduction(), tcp_cwnd_reduction(), and
> tcp_end_cwnd_reduction(), printing the IP:port, tp->snd_cwnd, and
> tp->snd_ssthresh?
>
> Based on the output you may be able to figure out where cwnd is being
> set to zero. If not, could you please post the printk output and
> tcpdump traces (.pcap, headers-only is fine) from your tests?

Hi Steve, do you have any updates on your debugging?

> thanks,
> neal
>
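
In case it helps with the instrumentation Neal suggested, here is a rough userspace sketch of the kind of trace line that could be emitted from each of the three functions. It is only an illustration under assumptions: struct cwnd_sample, format_cwnd_sample() and cwnd_is_zero() are hypothetical names made up for this sketch, and snprintf() stands in for printk. In the kernel the values would come from the socket itself (tp->snd_cwnd, tp->snd_ssthresh and the connection's address and port) rather than from a copied struct.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for the fields Neal asked to print.
 * This struct does not exist in the kernel; it only carries copies of
 * the values so the formatting can be tried outside the kernel.
 */
struct cwnd_sample {
	uint8_t  saddr[4];      /* local IPv4 address, one octet per byte */
	uint16_t sport;         /* local port, host byte order */
	uint32_t snd_cwnd;      /* copy of tp->snd_cwnd */
	uint32_t snd_ssthresh;  /* copy of tp->snd_ssthresh */
};

/*
 * Format one trace line: "<where>: a.b.c.d:port cwnd=N ssthresh=M".
 * 'where' is the name of the calling function, for example
 * "tcp_init_cwnd_reduction". Returns what snprintf() returns.
 */
int format_cwnd_sample(char *buf, size_t len, const char *where,
		       const struct cwnd_sample *s)
{
	return snprintf(buf, len, "%s: %u.%u.%u.%u:%u cwnd=%u ssthresh=%u",
			where,
			(unsigned int)s->saddr[0], (unsigned int)s->saddr[1],
			(unsigned int)s->saddr[2], (unsigned int)s->saddr[3],
			(unsigned int)s->sport,
			(unsigned int)s->snd_cwnd,
			(unsigned int)s->snd_ssthresh);
}

/* Returns 1 when the sample shows the condition being hunted for: cwnd == 0. */
int cwnd_is_zero(const struct cwnd_sample *s)
{
	return s->snd_cwnd == 0;
}
```

Tagging each line with the calling function's name should make it possible to see which of the three functions last touched cwnd before it reached zero, assuming that is indeed what is happening.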