All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lee Jones <lee@kernel.org>
To: lee@kernel.org, "David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Kuniyuki Iwashima <kuniyu@amazon.com>,
	Jens Axboe <axboe@kernel.dk>, Sasha Levin <sashal@kernel.org>,
	Michal Luczaj <mhal@rbox.co>, Rao Shoaib <Rao.Shoaib@oracle.com>,
	Pavel Begunkov <asml.silence@gmail.com>,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Cc: stable@vger.kernel.org
Subject: [PATCH v6.6 14/26] af_unix: Fix up unix_edge.successor for embryo socket.
Date: Wed, 21 May 2025 14:45:22 +0000	[thread overview]
Message-ID: <20250521144803.2050504-15-lee@kernel.org> (raw)
In-Reply-To: <20250521144803.2050504-1-lee@kernel.org>

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit dcf70df2048d27c5d186f013f101a4aefd63aa41 ]

To garbage collect inflight AF_UNIX sockets, we must define the
cyclic reference appropriately.  This is a bit tricky if the loop
consists of embryo sockets.

Suppose that the fd of AF_UNIX socket A is passed to D and the fd B
to C and that C and D are embryo sockets of A and B, respectively.
It may appear that there are two separate graphs, A (-> D) and
B (-> C), but this is not correct.

     A --. .-- B
          X
     C <-' `-> D

Now, D holds A's refcount, and C has B's refcount, so unix_release()
will never be called for A and B when we close() them.  However, no
one can call close() for D and C to free skbs holding refcounts of A
and B because C/D is in A/B's receive queue, which should have been
purged by unix_release() for A and B.

So, here's another type of cyclic reference.  When a fd of an AF_UNIX
socket is passed to an embryo socket, the reference is indirectly held
by its parent listening socket.

  .-> A                            .-> B
  |   `- sk_receive_queue          |   `- sk_receive_queue
  |      `- skb                    |      `- skb
  |         `- sk == C             |         `- sk == D
  |            `- sk_receive_queue |           `- sk_receive_queue
  |               `- skb +---------'               `- skb +-.
  |                                                         |
  `---------------------------------------------------------'

Technically, the graph must be denoted as A <-> B instead of A (-> D)
and B (-> C) to find such a cyclic reference without touching each
socket's receive queue.

  .-> A --. .-- B <-.
  |        X        |  ==  A <-> B
  `-- C <-' `-> D --'

We apply this fixup during GC by fetching the real successor by
unix_edge_successor().

When we call accept(), we clear unix_sock.listener under unix_gc_lock
not to confuse GC.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/20240325202425.60930-9-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit dcf70df2048d27c5d186f013f101a4aefd63aa41)
Signed-off-by: Lee Jones <lee@kernel.org>
---
 include/net/af_unix.h |  1 +
 net/unix/af_unix.c    |  2 +-
 net/unix/garbage.c    | 20 +++++++++++++++++++-
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/include/net/af_unix.h b/include/net/af_unix.h
index d6b755b254a17..9d92dd608fc42 100644
--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -24,6 +24,7 @@ void unix_inflight(struct user_struct *user, struct file *fp);
 void unix_notinflight(struct user_struct *user, struct file *fp);
 void unix_add_edges(struct scm_fp_list *fpl, struct unix_sock *receiver);
 void unix_del_edges(struct scm_fp_list *fpl);
+void unix_update_edges(struct unix_sock *receiver);
 int unix_prepare_fpl(struct scm_fp_list *fpl);
 void unix_destroy_fpl(struct scm_fp_list *fpl);
 void unix_gc(void);
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 4d4c035ba626d..93316e9efc532 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1705,7 +1705,7 @@ static int unix_accept(struct socket *sock, struct socket *newsock, int flags,
 	}
 
 	tsk = skb->sk;
-	unix_sk(tsk)->listener = NULL;
+	unix_update_edges(unix_sk(tsk));
 	skb_free_datagram(sk, skb);
 	wake_up_interruptible(&unix_sk(sk)->peer_wait);
 
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index cdeff548e1307..6ff7e0b5c5444 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -101,6 +101,17 @@ struct unix_sock *unix_get_socket(struct file *filp)
 	return NULL;
 }
 
+static struct unix_vertex *unix_edge_successor(struct unix_edge *edge)
+{
+	/* If an embryo socket has a fd,
+	 * the listener indirectly holds the fd's refcnt.
+	 */
+	if (edge->successor->listener)
+		return unix_sk(edge->successor->listener)->vertex;
+
+	return edge->successor->vertex;
+}
+
 static LIST_HEAD(unix_unvisited_vertices);
 
 enum unix_vertex_index {
@@ -209,6 +220,13 @@ void unix_del_edges(struct scm_fp_list *fpl)
 	fpl->inflight = false;
 }
 
+void unix_update_edges(struct unix_sock *receiver)
+{
+	spin_lock(&unix_gc_lock);
+	receiver->listener = NULL;
+	spin_unlock(&unix_gc_lock);
+}
+
 int unix_prepare_fpl(struct scm_fp_list *fpl)
 {
 	struct unix_vertex *vertex;
@@ -268,7 +286,7 @@ static void __unix_walk_scc(struct unix_vertex *vertex)
 
 	/* Explore neighbour vertices (receivers of the current vertex's fd). */
 	list_for_each_entry(edge, &vertex->edges, vertex_entry) {
-		struct unix_vertex *next_vertex = edge->successor->vertex;
+		struct unix_vertex *next_vertex = unix_edge_successor(edge);
 
 		if (!next_vertex)
 			continue;
-- 
2.49.0.1112.g889b7c5bd8-goog


  parent reply	other threads:[~2025-05-21 14:51 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-21 14:45 [PATCH v6.6 00/26] af_unix: Align with upstream to avoid a potential UAF Lee Jones
2025-05-21 14:45 ` [PATCH v6.6 01/26] af_unix: Return struct unix_sock from unix_get_socket() Lee Jones
2025-05-22  2:03   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 02/26] af_unix: Run GC on only one CPU Lee Jones
2025-05-22  2:04   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 03/26] af_unix: Try to run GC async Lee Jones
2025-05-22  2:04   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 04/26] af_unix: Replace BUG_ON() with WARN_ON_ONCE() Lee Jones
2025-05-22  2:08   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 05/26] af_unix: Remove io_uring code for GC Lee Jones
2025-05-22  2:07   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 06/26] af_unix: Remove CONFIG_UNIX_SCM Lee Jones
2025-05-22  2:03   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 07/26] af_unix: Allocate struct unix_vertex for each inflight AF_UNIX fd Lee Jones
2025-05-22  2:08   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 08/26] af_unix: Allocate struct unix_edge " Lee Jones
2025-05-22  2:06   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 09/26] af_unix: Link struct unix_edge when queuing skb Lee Jones
2025-05-22  2:05   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 10/26] af_unix: Bulk update unix_tot_inflight/unix_inflight " Lee Jones
2025-05-22  2:03   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 11/26] af_unix: Iterate all vertices by DFS Lee Jones
2025-05-22  2:04   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 12/26] af_unix: Detect Strongly Connected Components Lee Jones
2025-05-22  2:06   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 13/26] af_unix: Save listener for embryo socket Lee Jones
2025-05-22  2:08   ` Sasha Levin
2025-05-21 14:45 ` Lee Jones [this message]
2025-05-22  2:04   ` [PATCH v6.6 14/26] af_unix: Fix up unix_edge.successor " Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 15/26] af_unix: Save O(n) setup of Tarjan's algo Lee Jones
2025-05-22  2:04   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 16/26] af_unix: Skip GC if no cycle exists Lee Jones
2025-05-22  2:07   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 17/26] af_unix: Avoid Tarjan's algorithm if unnecessary Lee Jones
2025-05-22  2:04   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 18/26] af_unix: Assign a unique index to SCC Lee Jones
2025-05-22  2:07   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 19/26] af_unix: Detect dead SCC Lee Jones
2025-05-22  2:05   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 20/26] af_unix: Replace garbage collection algorithm Lee Jones
2025-05-22  2:04   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 21/26] af_unix: Remove lock dance in unix_peek_fds() Lee Jones
2025-05-22  2:07   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 22/26] af_unix: Try not to hold unix_gc_lock during accept() Lee Jones
2025-05-22  2:07   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 23/26] af_unix: Don't access successor in unix_del_edges() during GC Lee Jones
2025-05-22  2:08   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 24/26] af_unix: Add dead flag to struct scm_fp_list Lee Jones
2025-05-22  2:05   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 25/26] af_unix: Fix garbage collection of embryos carrying OOB with SCM_RIGHTS Lee Jones
2025-05-22  2:04   ` Sasha Levin
2025-05-21 14:45 ` [PATCH v6.6 26/26] af_unix: Fix uninit-value in __unix_walk_scc() Lee Jones
2025-05-22  2:07   ` Sasha Levin
2025-05-29 12:26 ` [PATCH v6.6 00/26] af_unix: Align with upstream to avoid a potential UAF Greg KH

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250521144803.2050504-15-lee@kernel.org \
    --to=lee@kernel.org \
    --cc=Rao.Shoaib@oracle.com \
    --cc=asml.silence@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=kuba@kernel.org \
    --cc=kuniyu@amazon.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhal@rbox.co \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.