Date: Fri, 1 Apr 2022 14:09:58 +0200
From: Florian Westphal
To: Jaco Kroon
Cc: Florian Westphal, Eric Dumazet, Neal Cardwell, LKML, Netdev, Yuchung Cheng
Subject: Re: linux 5.17.1 disregarding ACK values resulting in stalled TCP connections
Message-ID: <20220401120958.GA28321@breakpoint.cc>
References: <10c1e561-8f01-784f-c4f4-a7c551de0644@uls.co.za> <5f1bbeb2-efe4-0b10-bc76-37eff30ea905@uls.co.za> <20220401001531.GB9545@breakpoint.cc> <7d08dcfd-6ba0-f972-cee3-4fa0eff8c855@uls.co.za>
In-Reply-To: <7d08dcfd-6ba0-f972-cee3-4fa0eff8c855@uls.co.za>
X-Mailing-List: netdev@vger.kernel.org

Jaco Kroon wrote:
> > Eric Dumazet wrote:
> >> Next step would be to attempt removing _all_ firewalls, especially not
> >> common setups like yours.
> >>
> >> conntrack had a bug preventing TFO deployment for a while, because
> >> many boxes kept buggy kernel versions for years.
> >>
> >> 356d7d88e088687b6578ca64601b0a2c9d145296 netfilter: nf_conntrack: fix
> >> tcp_in_window for Fast Open
> > Jaco could also try with
> > net.netfilter.nf_conntrack_tcp_be_liberal=1
> >
> > and, if that helps, with liberal=0 and
> > sysctl net.netfilter.nf_conntrack_log_invalid=6
> >
> > (check dmesg/syslog/nflog).
>
> Our core firewalls already had nf_conntrack_tcp_be_liberal for other
> reasons (asymmetric routing combined with conntrackd left-over if I
> recall), so maybe that's why it got through there ... don't exactly want
> to just flip that setting though, is there a way to log if it would have
> dropped anything, without actually dropping it (yet)?

This means conntrack doesn't tag packets as invalid EVEN if it would
consider sequence/ack out-of-window (e.g. due to a bug).

I have a hard time seeing how tcp liberal-mode conntrack would be to
blame here.

The only other thing you could check is whether
net.netfilter.nf_conntrack_checksum=0 helps (but I doubt it).
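
For reference, a minimal read-only sketch that prints the current values of
the three conntrack sysctls named in this thread, so the state of a box can
be checked before anything is changed. It assumes the usual /proc/sys
layout, where a key such as net.netfilter.nf_conntrack_checksum lives at
/proc/sys/net/netfilter/nf_conntrack_checksum. The helper names
(sysctl_path, read_sysctl) are made up for illustration and are not part of
any existing tool.

```python
#!/usr/bin/env python3
"""Print the conntrack sysctls discussed in this thread (read-only)."""
import os

# Keys taken from the thread. For nf_conntrack_log_invalid the value is an
# IP protocol number, so 6 means "log invalid TCP packets".
KEYS = [
    "net.netfilter.nf_conntrack_tcp_be_liberal",
    "net.netfilter.nf_conntrack_log_invalid",
    "net.netfilter.nf_conntrack_checksum",
]


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl key to its file under root (hypothetical helper)."""
    return os.path.join(root, *name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Return the value as a string, or None if the key cannot be read
    (for example when nf_conntrack is not loaded)."""
    try:
        with open(sysctl_path(name, root)) as f:
            return f.read().strip()
    except OSError:
        return None


def main():
    for key in KEYS:
        value = read_sysctl(key)
        shown = value if value is not None else "(not available)"
        print(f"{key} = {shown}")


if __name__ == "__main__":
    main()
```

The script only reads; changing a value still goes through sysctl -w (or
writing to the same file) as root, and any nf_conntrack_log_invalid output
ends up in dmesg/syslog/nflog as noted above.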