public inbox for linux-kernel@vger.kernel.org
From: Leon Hwang <leon.hwang@linux.dev>
To: Eric Dumazet <edumazet@google.com>
Cc: "Jakub Kicinski" <kuba@kernel.org>,
	"Leon Hwang" <leon.huangfu@shopee.com>,
	netdev@vger.kernel.org, "David S . Miller" <davem@davemloft.net>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Simon Horman" <horms@kernel.org>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Shuah Khan" <skhan@linuxfoundation.org>,
	"David Ahern" <dsahern@kernel.org>,
	"Neal Cardwell" <ncardwell@google.com>,
	"Kuniyuki Iwashima" <kuniyu@google.com>,
	"Ilpo Järvinen" <ij@kernel.org>,
	"Ido Schimmel" <idosch@nvidia.com>,
	kerneljasonxing@gmail.com, lance.yang@linux.dev,
	jiayuan.chen@linux.dev, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH net-next] tcp: Add net.ipv4.tcp_purge_receive_queue sysctl
Date: Tue, 3 Mar 2026 16:54:07 +0800
Message-ID: <03144918-eaa8-461e-915d-232e29a52557@linux.dev>
In-Reply-To: <CANn89i+Bt_AZK=16nekvs846P7fPWxkRrNaNNBOrH0sB7xS1uQ@mail.gmail.com>



On 3/3/26 16:17, Eric Dumazet wrote:
> On Tue, Mar 3, 2026 at 8:55 AM Leon Hwang <leon.hwang@linux.dev> wrote:
>>
>>
>>
>> On 3/3/26 14:26, Leon Hwang wrote:
>>>
>>>
>>> On 3/3/26 11:55, Eric Dumazet wrote:
>>>> On Tue, Mar 3, 2026 at 3:12 AM Leon Hwang <leon.hwang@linux.dev> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 3/3/26 08:22, Jakub Kicinski wrote:
>>>>>> On Mon, 2 Mar 2026 17:55:59 +0800 Leon Hwang wrote:
>>>>>>> On 26/2/26 09:43, Jakub Kicinski wrote:
>>>>>>>> On Wed, 25 Feb 2026 15:46:33 +0800 Leon Hwang wrote:
>>>>>>>>> Issue:
>>>>>>>>> When a TCP socket in the CLOSE_WAIT state receives a RST packet, the
>>>>>>>>> current implementation does not clear the socket's receive queue. This
>>>>>>>>> causes SKBs in the queue to remain allocated until the socket is
>>>>>>>>> explicitly closed by the application. As a consequence:
>>>>>>>>>
>>>>>>>>> 1. The page pool pages held by these SKBs are not released.
>>>>>>>>
>>>>>>>> On what kernel version and driver are you observing this?
>>>>>>>
>>>>>>> # uname -r
>>>>>>> 6.19.0-061900-generic
>>>>>>>
>>>>>>> # ethtool -i eth0
>>>>>>> driver: mlx5_core
>>>>>>> version: 6.19.0-061900-generic
>>>>>>> firmware-version: 26.43.2566 (MT_0000000531)
>>>>>>
>>>>>> Okay... this kernel + driver should just patiently wait for the page
>>>>>> pool to go away.
>>>>>>
>>>>>> What is the actual, end user problem that you're trying to solve?
>>>>>> A few kB of data waiting to be freed is not a huge problem..
>>>>>
>>>>> Yes, it is not a huge problem.
>>>>>
>>>>> The actual end-user issue was discussed in
>>>>> "page_pool: Add page_pool_release_stalled tracepoint" [1].
>>>>>
>>>>> I think it would be useful to provide a way for SREs to purge the
>>>>> receive queue when CLOSE_WAIT TCP sockets receive RST packets. If the
>>>>> NIC, e.g., Mellanox, flaps, the underlying page pool and pages can be
>>>>> released at the same time.
>>>>>
>>>>> Links:
>>>>> [1]
>>>>> https://lore.kernel.org/netdev/b676baa0-2044-4a74-900d-f471620f2896@linux.dev/
>>>>
>>>> Perhaps SRE could use this in an emergency?
>>>>
>>>> ss -t -a state close-wait -K
>>>
>>> This ss command is acceptable in an emergency.
>>>
>>
>> However, once a CLOSE_WAIT TCP socket receives an RST packet, it
>> transitions to the CLOSE state. A socket in the CLOSE state cannot be
>> killed using the ss approach.
>>
>> The SKBs remain in the receive queue of the CLOSE socket until it is
>> closed by the user-space application.
> 
> Why does the user-space application not drain the receive queue?
> 
> Is there a missing EPOLLIN or something?

The user-space application uses a TCP connection pool. It establishes
several TCP connections at startup and keeps them in the pool.

However, the application does not drain the receive queue of every
pooled connection. Instead, it selects one connection from the pool
using a hash algorithm for communication with the TCP server. Only when
it attempts to write data through a socket in the CLOSE state does it
receive -EPIPE and close that socket. As a result, a TCP connection
whose underlying socket state is CLOSE may retain an SKB in its receive
queue for as long as it is not selected for communication.

I proposed a solution to address this issue: close the TCP connection if
the underlying sk_err is non-zero.
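
A minimal sketch of that idea, seen from user space. It assumes the
pool is a plain list of connected sockets and that a periodic sweep is
acceptable; the names pending_error() and sweep_pool() are hypothetical
and not taken from the application. sk_err itself is kernel-internal,
so the sketch reads it through getsockopt(SO_ERROR). Reading SO_ERROR
also clears the pending error, so a socket that reports one is closed
right away:

```python
import socket


def pending_error(sock: socket.socket) -> int:
    """Return the socket's pending error (0 if none).

    Note: reading SO_ERROR clears the error, so the caller should act
    on a non-zero result immediately.
    """
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)


def sweep_pool(pool: list[socket.socket]) -> list[socket.socket]:
    """Close pooled sockets that report an error; return the survivors."""
    alive = []
    for sock in pool:
        if pending_error(sock) != 0:
            # Closing the socket lets the kernel free whatever is still
            # queued on its receive queue.
            sock.close()
        else:
            alive.append(sock)
    return alive
```

Such a sweep could run on a timer or before each connection selection,
so that a reset connection is closed even if the hash never picks it.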

Thanks,
Leon


Thread overview: 13+ messages
2026-02-25  7:46 [RFC PATCH net-next] tcp: Add net.ipv4.tcp_purge_receive_queue sysctl Leon Hwang
2026-02-25  8:31 ` Eric Dumazet
2026-02-25  9:48   ` Leon Hwang
2026-02-26  1:43 ` Jakub Kicinski
2026-03-02  9:55   ` Leon Hwang
2026-03-03  0:22     ` Jakub Kicinski
2026-03-03  2:12       ` Leon Hwang
2026-03-03  3:55         ` Eric Dumazet
2026-03-03  6:26           ` Leon Hwang
2026-03-03  7:55             ` Leon Hwang
2026-03-03  8:17               ` Eric Dumazet
2026-03-03  8:54                 ` Leon Hwang [this message]
2026-03-03  8:56                   ` Eric Dumazet
