From: Alexei Starovoitov
Subject: Re: [PATCH net] tcp: fix tcp_mtu_probe() vs highest_sack
Date: Mon, 30 Oct 2017 23:30:19 -0700
Message-ID: <20171031063018.ezzfd6hitd535lar@ast-mbp>
In-Reply-To: <1509430902.3828.15.camel@edumazet-glaptop3.roam.corp.google.com>
References: <2325466.Xo6SG5M5hd@natalenko.name>
 <20171026020724.bgobtktvcpkhco4h@ast-mbp>
 <1509430100.3828.12.camel@edumazet-glaptop3.roam.corp.google.com>
 <20171031061750.al6gjwa7hknefwfy@ast-mbp>
 <1509430902.3828.15.camel@edumazet-glaptop3.roam.corp.google.com>
To: Eric Dumazet
Cc: David Miller, Yuchung Cheng, Oleksandr Natalenko, Roman Gushchin,
 netdev, Neal Cardwell, Lawrence Brakmo

On Mon, Oct 30, 2017 at 11:21:42PM -0700, Eric Dumazet wrote:
> On Mon, 2017-10-30 at 23:17 -0700, Alexei Starovoitov wrote:
> > On Mon, Oct 30, 2017 at 11:08:20PM -0700, Eric Dumazet wrote:
> > > From: Eric Dumazet
> > >
> > > Based on SNMP values provided by Roman, Yuchung made the observation
> > > that some crashes in tcp_sacktag_walk() might be caused by MTU probing.
> > >
> > > Looking at tcp_mtu_probe(), I found that when a new skb was placed
> > > in front of the write queue, we were not updating tcp highest sack.
> > >
> > > If one skb is freed because all its content was copied to the new skb
> > > (for MTU probing), then tp->highest_sack could point to a now freed skb.
> > >
> > > Bad things would then happen, including infinite loops.
> > >
> > > This patch renames tcp_highest_sack_combine() and uses it
> > > from tcp_mtu_probe() to fix the bug.
> > >
> > > Note that I also removed one test against tp->sacked_out,
> > > since we want to replace tp->highest_sack regardless of whatever
> > > condition, since keeping a stale pointer to freed skb is a recipe
> > > for disaster.
> > >
> > > Fixes: a47e5a988a57 ("[TCP]: Convert highest_sack to sk_buff to allow direct access")
> > > Signed-off-by: Eric Dumazet
> > > Reported-by: Alexei Starovoitov
> > > Reported-by: Roman Gushchin
> > > Reported-by: Oleksandr Natalenko
> >
> > Thanks!
> >
> > Acked-by: Alexei Starovoitov
> >
> > wow. a bug from 2007.
> > Any idea why it only started to bite us in 4.11 ?
> >
> > It's not trivial for us to reproduce it, but we will definitely
> > test the patch as soon as we can.
> > Do you have packet drill test or something for easy repro?
>
> I tried to cook a packetdrill test but could not trigger the issue.
>
> When have you started to enable mtu probing ?

For some time. Somehow a 4.6-based kernel didn't trigger it.
Maybe it's a different bug still...