From: Jiayuan Chen <jiayuan.chen@linux.dev>
To: xietangxin <xietangxin@yeah.net>, Willy Tarreau <w@1wt.eu>,
Eric Dumazet <edumazet@google.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>,
"David S . Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Simon Horman <horms@kernel.org>,
Neal Cardwell <ncardwell@google.com>,
Kuniyuki Iwashima <kuniyu@google.com>,
netdev@vger.kernel.org, eric.dumazet@gmail.com,
Zhouyan Deng <dengzhouyan_nwpu@163.com>,
Florian Westphal <fw@strlen.de>
Subject: Re: [PATCH net] tcp: secure_seq: add back ports to TS offset
Date: Mon, 8 Jun 2026 21:14:26 +0800
Message-ID: <0e89ceb5-a2ae-4e7c-8fe2-5b6a89ba6ac5@linux.dev>
In-Reply-To: <90cb7e92-2451-4c67-9f35-6ff96b7efd77@yeah.net>
On 6/8/26 8:51 PM, xietangxin wrote:
>
> On 6/8/2026 5:42 PM, Willy Tarreau wrote:
>> On Mon, Jun 08, 2026 at 01:51:49AM -0700, Eric Dumazet wrote:
>>> On Sat, Jun 6, 2026 at 4:06 AM xietangxin <xietangxin@yeah.net> wrote:
>>>>
>>>> Hi Eric and netdev,
>>>>
>>>> I noticed a significant TCP performance regression (QPS drop) when using
>>>> iptables MASQUERADE with the `--random-fully` option, and I have bisected
>>>> it down to commit 165573e41f2f66ef98940cf65f838b2cb575d9d1
>>>> (tcp: secure_seq: add back ports to TS offset).
>>>>
>>>> Here is the benchmark environment and test results.
>>>> Environment:
>>>> - Client & Server: 2 VMs
>>>> - Server: Nginx listening on port 80 (HTTP), and ip 10.0.0.1
>>>> - Benchmark tool: wrk (short-lived connections with "Connection: close")
>>>>
>>>> Test Commands
>>>> 1. With random-fully:
>>>> # iptables -t nat -A POSTROUTING -d 10.0.0.1 -p tcp --dport 80 -j MASQUERADE --random-fully
>>>> # wrk -t8 -c200 -H "Connection: close" -d10s --latency http://10.0.0.1:80
>>>> 2. Without random-fully:
>>>> # iptables -t nat -A POSTROUTING -d 10.0.0.1 -p tcp --dport 80 -j MASQUERADE
>>>> # wrk -t8 -c200 -H "Connection: close" -d10s --latency http://10.0.0.1:80
>>>>
>>>> Test Results (QPS):
>>>> 1. Parent Commit (7f083faf59d14c04e01ec05a7507f036c965acf8):
>>>> - with random-fully: 18145.74, 15006.39, 15716.67
>>>> - without random-fully: 18556.36, 16339.22, 21506.02
>>>>
>>>> 2. Bad Commit (165573e41f2f66ef98940cf65f838b2cb575d9d1):
>>>> - with random-fully: 11074.76, 10383.20, 10164.81 <-- (~35% drop)
>>>> - without random-fully: 17310.75, 20279.85, 18399.48
>>>>
>>>> Is this performance degradation an expected side-effect of the security fix,
>>>> or is there any sysctl param we should tune when `--random-fully` is
>>>> required for high-concurrency short connections?
>>> Hi Tangxin
>>>
>>> I do not know why that patch would affect MASQUERADE performance.
>>>
>>> Pablo, Florian, do you have an idea?
>> I suspect it's because MASQUERADE can shuffle the ports around and
>> break the end-to-end mapping. With host-based ISN the increments
>> remain positive regardless of the ports, while with port-based
>> increments if you shuffle ports around, two consecutive uses of
>> the same port can end up showing a decreasing ISN, and some
>> outgoing SYN will get an ACK instead of a SYN-ACK, then send an
>> RST, and a SYN again, causing a degradation.
>>
>> I'm not saying this is necessarily what happens here but based on the
>> commit message description I suspect that this is what's happening
>> here. There's always a tradeoff between ISN secrecy and reliability
>> unfortunately.
>>
>> Willy
> Hi,
>
> Willy, your hypothesis is 100% correct!
> I captured the packets during the benchmark on the bad commit,
> and the trace perfectly shows the "SYN -> ACK -> RST".
>
> Here is the key snippet of the packet trace (Client: 10.0.0.2, Server: 10.0.0.1):
>
> // 1. First connection closes, Server sends last ACK(410615916), entering TIME_WAIT.
> 12105 08:54:39.128861 10.0.0.1 -> 10.0.0.2 TCP 80 → 47824 [ACK] Seq=3315216203 Ack=410615916 TSval=273827652 TSecr=370383870
>
> // 2. ~200ms later, next short-conn reuses port 47824 via MASQUERADE --random-fully
> 47637 08:54:39.332281 10.0.0.2 -> 10.0.0.1 TCP 47824 → 80 [SYN] Seq=559739866 TSval=4137539723 TSecr=0
>
> // 3. Server sends an ACK carrying the old connection's expected ACK (410615916).
> 48591 08:54:39.337692 10.0.0.1 -> 10.0.0.2 TCP 80 → 47824 [ACK] Seq=3315216203 Ack=410615916 TSval=273827858 TSecr=370383870
>
> // 4. Client receives the unexpected old ACK, responds with RST, and has to retry the connection.
> 48600 08:54:39.337799 10.0.0.2 -> 10.0.0.1 TCP 47824 → 80 [RST] Seq=410615916 Win=0
>
>
> Are there any architectural recommendations we should consider here,
> or is this considered an acceptable trade-off for security?
This is a classic PAWS problem when packets pass through a NAT/gateway.
Can you test the performance with each of the following two configs (on the client)?
sysctl -w net.ipv4.tcp_timestamps=2
sysctl -w net.ipv4.tcp_timestamps=0
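
For reference, the PAWS comparison on the numbers in the trace above can be sketched as follows. This is a simplified model in Python, not the kernel code: the helper names s32 and paws_reject are made up for illustration, the server's ts_recent is assumed to be the value it echoes as TSecr (370383870), and the idle-time exception and the rest of the TIME_WAIT handling are ignored.

```python
# Simplified model of the PAWS (RFC 7323) timestamp comparison.
# All names are illustrative; this is not the kernel implementation.


def s32(x: int) -> int:
    """Interpret the low 32 bits of x as a signed 32-bit integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def paws_reject(ts_recent: int, rcv_tsval: int) -> bool:
    """True when rcv_tsval is older than ts_recent in 32-bit serial arithmetic."""
    return s32(ts_recent - rcv_tsval) > 0


# Values taken from the packet trace quoted above.
TS_RECENT = 370383870       # TSecr echoed by the server (assumed to be its ts_recent)
NEW_SYN_TSVAL = 4137539723  # TSval of the SYN that reused port 47824

if __name__ == "__main__":
    # The new SYN's TSval compares as "older" than ts_recent, so it fails the check.
    print(paws_reject(TS_RECENT, NEW_SYN_TSVAL))
```

With a per-connection timestamp offset, a port reused behind the NAT can present a TSval that looks older than the one the server remembers for the same 4-tuple, which would be consistent with the ACK in step 3 of the trace.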