From: "Mickaël Salaün" <mic@digikod.net>
To: "Günther Noack" <gnoack3000@gmail.com>,
"Eric Dumazet" <edumazet@google.com>,
"Kuniyuki Iwashima" <kuniyu@google.com>,
"Paolo Abeni" <pabeni@redhat.com>,
"Willem de Bruijn" <willemb@google.com>,
"Sebastian Andrzej Siewior" <bigeasy@linutronix.de>,
"Jason Xing" <kerneljasonxing@gmail.com>
Cc: John Johansen <john.johansen@canonical.com>,
Tingmao Wang <m@maowtm.org>,
Justin Suess <utilityemal77@gmail.com>,
Jann Horn <jannh@google.com>,
linux-security-module@vger.kernel.org,
Samasth Norway Ananda <samasth.norway.ananda@oracle.com>,
Matthieu Buffet <matthieu@buffet.re>,
Mikhail Ivanov <ivanov.mikhail1@huawei-partners.com>,
konstantin.meskhidze@huawei.com,
Demi Marie Obenour <demiobenour@gmail.com>,
Alyssa Ross <hi@alyssa.is>,
Tahera Fahimi <fahimitahera@gmail.com>,
netdev@vger.kernel.org
Subject: Re: [PATCH v5 2/9] landlock: Control pathname UNIX domain socket resolution by path
Date: Sun, 8 Mar 2026 10:18:04 +0100
Message-ID: <20260308.AiYoh5KooBei@digikod.net>
In-Reply-To: <20260220.82a8adda6f95@gnoack.org>
On Fri, Feb 20, 2026 at 03:33:28PM +0100, Günther Noack wrote:
> +netdev, we could use some advice on the locking approach in af_unix (see below)
>
> On Wed, Feb 18, 2026 at 10:37:14AM +0100, Mickaël Salaün wrote:
> > On Sun, Feb 15, 2026 at 11:51:50AM +0100, Günther Noack wrote:
> > > diff --git a/include/uapi/linux/landlock.h b/include/uapi/linux/landlock.h
> > > index f88fa1f68b77..3a8fc3af0d64 100644
> > > --- a/include/uapi/linux/landlock.h
> > > +++ b/include/uapi/linux/landlock.h
> > > @@ -248,6 +248,15 @@ struct landlock_net_port_attr {
> > > *
> > > * This access right is available since the fifth version of the Landlock
> > > * ABI.
> > > + * - %LANDLOCK_ACCESS_FS_RESOLVE_UNIX: Look up pathname UNIX domain sockets
> > > + * (:manpage:`unix(7)`). On UNIX domain sockets, this restricts both calls to
> > > + * :manpage:`connect(2)` as well as calls to :manpage:`sendmsg(2)` with an
> > > + * explicit recipient address.
> > > + *
> > > + * This access right only applies to connections to UNIX server sockets which
> > > + * were created outside of the newly created Landlock domain (e.g. from within
> > > + * a parent domain or from an unrestricted process). Newly created UNIX
> > > + * servers within the same Landlock domain continue to be accessible.
> >
> > It might help to add a reference to the explicit scope mechanism.
> >
> > Please squash patch 9/9 into this one and also add a reference here to
> > the rationale described in security/landlock.rst
>
> Sounds good, will do.
>
>
> > > +static void unmask_scoped_access(const struct landlock_ruleset *const client,
> > > + const struct landlock_ruleset *const server,
> > > + struct layer_access_masks *const masks,
> > > + const access_mask_t access)
> >
> > This helper should be moved to task.c and factored out with
> > domain_is_scoped(). This should be a dedicated patch.
>
> (already discussed in another follow-up mail)
>
>
> > > +static int hook_unix_find(const struct path *const path, struct sock *other,
> > > + int flags)
> > > +{
> > > + const struct landlock_ruleset *dom_other;
> > > + const struct landlock_cred_security *subject;
> > > + struct layer_access_masks layer_masks;
> > > + struct landlock_request request = {};
> > > + static const struct access_masks fs_resolve_unix = {
> > > + .fs = LANDLOCK_ACCESS_FS_RESOLVE_UNIX,
> > > + };
> > > +
> > > + /* Lookup for the purpose of saving coredumps is OK. */
> > > + if (unlikely(flags & SOCK_COREDUMP))
> > > + return 0;
> > > +
> > > + /* Access to the same (or a lower) domain is always allowed. */
> > > + subject = landlock_get_applicable_subject(current_cred(),
> > > + fs_resolve_unix, NULL);
> > > +
> > > + if (!subject)
> > > + return 0;
> > > +
> > > + if (!landlock_init_layer_masks(subject->domain, fs_resolve_unix.fs,
> > > + &layer_masks, LANDLOCK_KEY_INODE))
> > > + return 0;
> > > +
> > > + /* Checks the layers in which we are connecting within the same domain. */
> > > + dom_other = landlock_cred(other->sk_socket->file->f_cred)->domain;
> >
> > We need to call unix_state_lock(other) before reading it, and check for
> > SOCK_DEAD, and check sk_socket before dereferencing it. Indeed,
> > the socket can be made an orphan (see unix_dgram_sendmsg and
> > unix_stream_connect). I *think* a socket cannot be "resurrected" or
> > recycled once dead, so we may assume there is no race condition wrt
> > dom_other, but please double check. This lockless call should be made
> > clear in the LSM hook. It's OK to not lock the socket before
> > security_unix_find() (1) because no LSM might implement it and (2) they
> > might not need to lock the socket (e.g. if the caller is not sandboxed).
> >
> > The updated code should look something like this:
> >
> > unix_state_unlock(other);
unix_state_lock(other) of course...
> > if (unlikely(sock_flag(other, SOCK_DEAD) || !other->sk_socket)) {
> > unix_state_unlock(other);
> > return 0;
> > }
> >
> > dom_other = landlock_cred(other->sk_socket->file->f_cred)->domain;
> > unix_state_unlock(other);
>
> Thank you for spotting the locking concern!
>
> @anyone from netdev, could you please advise on the correct locking
> approach here?
>
> * Do we need to check SOCK_DEAD?
>
> You are saying that we need to do that, but it's not clear to me
> why.
>
> If you look at the places where unix_find_other() is called in
> af_unix.c, then you'll find that all of them check for SOCK_DEAD and
> then restart from unix_find_other() if they do actually discover
> that the socket is dead. I think that this is catching this race
> condition scenario:
>
> * a server socket exists and is alive
> * A client connects: af_unix.c's unix_stream_connect() calls
> unix_find_other() and finds the server socket...
> * (Concurrently): The server closes the socket and enters
> unix_release_sock(). This function:
> 1. disassociates the server sock from the named socket inode
> number in the hash table (=> future unix_find_other() calls
> will fail).
> 2. calls sock_orphan(), which sets SOCK_DEAD.
> * ...(client connection resuming): unix_stream_connect() continues,
> grabs the unix_state_lock(), which apparently protects everything
> here, checks that the socket is not dead - and discovers that it
> IS suddenly dead. This was not supposed to happen. The code
> recovers from that by retrying everything starting with the
> unix_find_other() call. From unix_release_sock(), we know that
> the inode is not associated with the sock any more -- so the
> unix_find_socket_by_inode() call should be failing on the next
> attempt.
>
> (This works with unix_dgram_connect() and unix_dgram_sendmsg() as
> well.)
>
> The comments next to the SOCK_DEAD checks are also suggesting this.
>
> * What lock to use
>
> I am having trouble reasoning about what lock is used for what in
> this code.
It's not clear to me either, and it looks like it's not consistent
across protocols.
>
> Is it possible that the lock protecting ->sk_socket is the
> ->sk_callback_lock, not the unix_state_lock()? The only callers to
> sk_set_socket are either sock_orphan/sock_graft (both grabbing
> sk_callback_lock), or they are creating new struct sock objects that
> they own exclusively, and don't need locks yet.
>
> Admittedly, in af_unix.c, sock_orphan() and sock_graft() only get
> called in contexts where the unix_state_lock() is held, so it would
> probably work as well to lock that, but it is maybe a more
> fine-grained approach to use sk_callback_lock?
>
>
> So... how about a scheme where we only check for ->sk_socket not being
> NULL:
>
>   read_lock_bh(&other->sk_callback_lock);
>   struct socket *other_sock = other->sk_socket;
>   if (!other_sock) {
>           read_unlock_bh(&other->sk_callback_lock);
>           return 0;
>   }
>   /* XXX double check whether we need a lock here too */
>   struct file *other_file = other_sock->file;
>   if (!other_file) {
>           read_unlock_bh(&other->sk_callback_lock);
>           return 0;
>   }
>   read_unlock_bh(&other->sk_callback_lock);
>
> If this fails, that would in my understanding also be because the
> socket has died after the path lookup. We'd then return 0 (success),
> because we know that the surrounding SOCK_DEAD logic will repeat
> everything starting from the path lookup operation (this time likely
> failing with ECONNREFUSED, but maybe also with a success, if another
> server process was quick enough).
>
> Does this sound reasonable?
Actually, since commit 983512f3a87f ("net: Drop the lock in
skb_may_tx_timestamp()"), we can just use RCU + READ_ONCE(sk_socket) +
READ_ONCE(file). The socket and file should only be freed after the RCU
grace period. As a safeguard, this commit should be a Depends-on.
However, it is safer to return -ECONNREFUSED when sk_socket or file are
NULL.
It would be good to hear from netdev folks though.
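To make the RCU + READ_ONCE() idea concrete, here is a rough userspace
model of the read side. It is only a sketch, not kernel code: the
structs are cut-down stand-ins, rcu_read_lock()/rcu_read_unlock() are
no-op stubs, READ_ONCE() is a simplified macro, and peer_domain_id()
and domain_id are hypothetical names. In the real hook the last step
would read landlock_cred(file->f_cred)->domain instead.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for kernel primitives (assumptions, not the real ones). */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))
static inline void rcu_read_lock(void) {}
static inline void rcu_read_unlock(void) {}

/* Cut-down models of the kernel structures involved. */
struct cred { int domain_id; };
struct file { const struct cred *f_cred; };
struct socket { struct file *file; };
struct sock { struct socket *sk_socket; };

/*
 * Hypothetical helper: fetch the peer's domain without taking
 * unix_state_lock() or sk_callback_lock.  Each pointer is read exactly
 * once inside the RCU read-side section; a NULL at either step means the
 * peer went away after the path lookup, so report -ECONNREFUSED.
 */
int peer_domain_id(struct sock *other, int *domain_id)
{
	struct socket *sock;
	struct file *file;
	int err = -ECONNREFUSED;

	rcu_read_lock();
	sock = READ_ONCE(other->sk_socket);
	if (!sock)
		goto out;

	file = READ_ONCE(sock->file);
	if (!file)
		goto out;

	*domain_id = file->f_cred->domain_id;
	err = 0;
out:
	rcu_read_unlock();
	return err;
}
```

The single-read pattern matters here: checking other->sk_socket and then
dereferencing it a second time would reopen the window in which
sock_orphan() clears the pointer.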
TIL, there is an LSM hook for sock_graft().
> –Günther
>