Netdev List
 help / color / mirror / Atom feed
From: Jiayuan Chen <jiayuan.chen@linux.dev>
To: Cen Zhang <rollkingzzc@gmail.com>,
	"bpf@vger.kernel.org" <bpf@vger.kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>,
	Jakub Sitnicki <jakub@cloudflare.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Simon Horman <horms@kernel.org>,
	netdev@vger.kernel.org, bpf@vger.kernel.org,
	linux-kernel@vger.kernel.org, zerocling0077@gmail.com,
	2045gemini@gmail.com
Subject: Re: [PATCH] net: skmsg: pin the delayed-work psock in sk_psock_backlog
Date: Fri, 15 May 2026 16:54:43 +0800	[thread overview]
Message-ID: <4a96c17d-0024-4ddb-99b1-c8475b7b95e1@linux.dev> (raw)
In-Reply-To: <c5fadf42-583c-4e8c-8484-46781a0581b2@linux.dev>


On 5/15/26 4:26 PM, Jiayuan Chen wrote:
>
> On 5/15/26 4:12 PM, Cen Zhang wrote:
>> Dear Jiayuan Chen
>>
>> Jiayuan Chen <jiayuan.chen@linux.dev> 于2026年5月15日周五 14:10写道:
>>> Where is the 'last_old_ref_before_put' symbol from? I can't find it
>>> anywhere in the tree.
>>>
>>> If you are using LLMs to dig into races like this, please also have 
>>> them
>>> produce a reproducer, e.g. patch mdelay() into
>>>
>>> the relevant windows to widen them, then trigger it from userspace.
>>>
>>>
>> Hi Jiayuan,
>>
>> Thanks for checking this. You are right: last_old_ref_before_put is
>> not an in-tree kernel symbol. It was a temporary validation probe
>> label which recorded the old psock refcount immediately before the
>> backlog worker's final put, and it should not have appeared in the
>> commit message as if it were kernel output.
>>
>> The in-tree path I was trying to describe is:
>>
>>    sk_psock_backlog() starts at net/core/skmsg.c:670.
>>    get path: sk_psock_get(psock->sk), net/core/skmsg.c:692.
>>    put path: sk_psock_put(psock->sk, psock), net/core/skmsg.c:746.
>>    detach clears sk_user_data at net/core/skmsg.c:892.
>>    reattach publishes a replacement psock at net/core/skmsg.c:793.
>>    warning path: REFCOUNT_SUB_UAF at lib/refcount.c:28.
>>
>> The trigger was based on the in-tree sockmap_redir BPF selftest
>> under tools/testing/selftests/bpf/prog_tests/.
>> The one-shot test used AF_UNIX SOCK_STREAM socket pairs, attached
>> the sk_skb verdict program to the input map, inserted one socket
>> into the input map and one destination socket into the sockmap at
>> key 0, then sent one byte through the input peer so the destination
>> psock backlog worker was queued.
>> For validation I used a temporary local instrumentation patch in
>> net/core/skmsg.c. It added a debugfs-controlled gate in
>> sk_psock_backlog() after the TX-enabled check and before the
>> existing sk_psock_get(psock->sk) call, plus counters and pr_info()
>> snapshots in sk_psock_backlog(), sk_psock_init() and
>> sk_psock_drop(). It also stored the pointer returned by
>> sk_psock_get(psock->sk) for logging. The worker still used the
>> existing get path and the existing sk_psock_put(psock->sk, psock)
>> exit path.
>> With the worker parked before sk_psock_get(psock->sk), the test
>> forked: the child deleted the destination sockmap entry, and the
>> parent retried BPF_NOEXIST update of the same key with the same
>> destination socket fd until reattach succeeded.
>> After the delete completed, the test released the old worker. At
>> that point sk->sk_user_data referred to the replacement psock, while
>
> So, should the fix swap the order of sk->sk_user_data = null and 
> sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED)?
>
>

Please also carry a Fixes tag.


[...]
>>
>> I will send v2 as a new thread after the netdev 24-hour
>> interval, with the lab probe label removed from the commit text.
>> If useful, I can also share the small instrumentation/selftest
>> diff separately to show the exact widened window.
>
> You can just put the kernel patch and userspace program patch in this 
> thread (no need to send a new patch).
>
> Also this patch should be targeted to bpf not net.
>
> -- 
> pw-bot: cr
>
>
>

  reply	other threads:[~2026-05-15  8:54 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-15  5:04 [PATCH] net: skmsg: pin the delayed-work psock in sk_psock_backlog Zhang Cen
2026-05-15  6:09 ` Jiayuan Chen
2026-05-15  8:12   ` Cen Zhang
2026-05-15  8:26     ` Jiayuan Chen
2026-05-15  8:54       ` Jiayuan Chen [this message]
2026-05-15  9:10         ` Cen Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4a96c17d-0024-4ddb-99b1-c8475b7b95e1@linux.dev \
    --to=jiayuan.chen@linux.dev \
    --cc=2045gemini@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=horms@kernel.org \
    --cc=jakub@cloudflare.com \
    --cc=john.fastabend@gmail.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=rollkingzzc@gmail.com \
    --cc=zerocling0077@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox