* [PATCH v1 net-next 1/7] af_unix: Count cyclic SCC.
2025-11-15 2:08 [PATCH v1 net-next 0/7] af_unix: GC cleanup and optimisation Kuniyuki Iwashima
@ 2025-11-15 2:08 ` Kuniyuki Iwashima
2025-11-15 2:08 ` [PATCH v1 net-next 2/7] af_unix: Simplify GC state Kuniyuki Iwashima
` (6 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2025-11-15 2:08 UTC (permalink / raw)
To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
__unix_walk_scc() and unix_walk_scc_fast() call unix_scc_cyclic()
for each SCC to check if it forms a cyclic reference, so that we
can skip GC at the following invocations if no SCC has a cycle.
If we count the number of cyclic SCCs in __unix_walk_scc(), we can
simplify unix_walk_scc_fast() because the number of cyclic SCCs
only changes when it garbage-collects an SCC.
So, let's count cyclic SCCs in __unix_walk_scc() and decrement the
count in unix_walk_scc_fast() when performing garbage collection.
Note that we will use this counter in a later patch to check if a
cycle existed in the previous GC run.
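The bookkeeping described above can be sketched in user space as follows. This is only an illustration: struct scc_model and both helpers are hypothetical names, not kernel code, and the sketch assumes that an SCC found dead is always a cyclic one (which is what lets the fast walk decrement unconditionally).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one SCC (not the kernel's layout). */
struct scc_model {
	bool cyclic;    /* the SCC contains a reference cycle */
	bool dead;      /* no reference from outside the SCC remains */
	bool collected; /* already garbage-collected */
};

/* Full walk: recount the cyclic SCCs from scratch. */
static unsigned long full_walk(const struct scc_model *sccs, size_t n)
{
	unsigned long cyclic_sccs = 0;

	for (size_t i = 0; i < n; i++)
		if (!sccs[i].collected && sccs[i].cyclic)
			cyclic_sccs++;

	return cyclic_sccs;
}

/* Fast walk: start from the cached count and decrement it once per
 * collected SCC.  Assumption: a dead SCC is always cyclic.
 */
static unsigned long fast_walk(struct scc_model *sccs, size_t n,
			       unsigned long cyclic_sccs)
{
	for (size_t i = 0; i < n; i++) {
		if (sccs[i].collected || !sccs[i].dead)
			continue;

		sccs[i].collected = true;
		cyclic_sccs--;
	}

	return cyclic_sccs;
}
```

Under that assumption, the value returned by fast_walk() matches what a fresh full_walk() would compute afterwards, so no per-SCC cycle check is needed on the fast path.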
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
net/unix/garbage.c | 31 +++++++++++++++++++++----------
1 file changed, 21 insertions(+), 10 deletions(-)
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index 65396a4e1b07..9f62d5097973 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -404,9 +404,11 @@ static bool unix_scc_cyclic(struct list_head *scc)
static LIST_HEAD(unix_visited_vertices);
static unsigned long unix_vertex_grouped_index = UNIX_VERTEX_INDEX_MARK2;
-static void __unix_walk_scc(struct unix_vertex *vertex, unsigned long *last_index,
- struct sk_buff_head *hitlist)
+static unsigned long __unix_walk_scc(struct unix_vertex *vertex,
+ unsigned long *last_index,
+ struct sk_buff_head *hitlist)
{
+ unsigned long cyclic_sccs = 0;
LIST_HEAD(vertex_stack);
struct unix_edge *edge;
LIST_HEAD(edge_stack);
@@ -497,8 +499,8 @@ static void __unix_walk_scc(struct unix_vertex *vertex, unsigned long *last_inde
if (unix_vertex_max_scc_index < vertex->scc_index)
unix_vertex_max_scc_index = vertex->scc_index;
- if (!unix_graph_maybe_cyclic)
- unix_graph_maybe_cyclic = unix_scc_cyclic(&scc);
+ if (unix_scc_cyclic(&scc))
+ cyclic_sccs++;
}
list_del(&scc);
@@ -507,13 +509,17 @@ static void __unix_walk_scc(struct unix_vertex *vertex, unsigned long *last_inde
/* Need backtracking ? */
if (!list_empty(&edge_stack))
goto prev_vertex;
+
+ return cyclic_sccs;
}
+static unsigned long unix_graph_cyclic_sccs;
+
static void unix_walk_scc(struct sk_buff_head *hitlist)
{
unsigned long last_index = UNIX_VERTEX_INDEX_START;
+ unsigned long cyclic_sccs = 0;
- unix_graph_maybe_cyclic = false;
unix_vertex_max_scc_index = UNIX_VERTEX_INDEX_START;
/* Visit every vertex exactly once.
@@ -523,18 +529,20 @@ static void unix_walk_scc(struct sk_buff_head *hitlist)
struct unix_vertex *vertex;
vertex = list_first_entry(&unix_unvisited_vertices, typeof(*vertex), entry);
- __unix_walk_scc(vertex, &last_index, hitlist);
+ cyclic_sccs += __unix_walk_scc(vertex, &last_index, hitlist);
}
list_replace_init(&unix_visited_vertices, &unix_unvisited_vertices);
swap(unix_vertex_unvisited_index, unix_vertex_grouped_index);
+ unix_graph_cyclic_sccs = cyclic_sccs;
+ unix_graph_maybe_cyclic = !!unix_graph_cyclic_sccs;
unix_graph_grouped = true;
}
static void unix_walk_scc_fast(struct sk_buff_head *hitlist)
{
- unix_graph_maybe_cyclic = false;
+ unsigned long cyclic_sccs = unix_graph_cyclic_sccs;
while (!list_empty(&unix_unvisited_vertices)) {
struct unix_vertex *vertex;
@@ -551,15 +559,18 @@ static void unix_walk_scc_fast(struct sk_buff_head *hitlist)
scc_dead = unix_vertex_dead(vertex);
}
- if (scc_dead)
+ if (scc_dead) {
+ cyclic_sccs--;
unix_collect_skb(&scc, hitlist);
- else if (!unix_graph_maybe_cyclic)
- unix_graph_maybe_cyclic = unix_scc_cyclic(&scc);
+ }
list_del(&scc);
}
list_replace_init(&unix_visited_vertices, &unix_unvisited_vertices);
+
+ unix_graph_cyclic_sccs = cyclic_sccs;
+ unix_graph_maybe_cyclic = !!unix_graph_cyclic_sccs;
}
static bool gc_in_progress;
--
2.52.0.rc1.455.g30608eb744-goog
* [PATCH v1 net-next 2/7] af_unix: Simplify GC state.
2025-11-15 2:08 [PATCH v1 net-next 0/7] af_unix: GC cleanup and optimisation Kuniyuki Iwashima
2025-11-15 2:08 ` [PATCH v1 net-next 1/7] af_unix: Count cyclic SCC Kuniyuki Iwashima
@ 2025-11-15 2:08 ` Kuniyuki Iwashima
2025-11-15 2:08 ` [PATCH v1 net-next 3/7] af_unix: Don't trigger GC from close() if unnecessary Kuniyuki Iwashima
` (5 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2025-11-15 2:08 UTC (permalink / raw)
To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
GC manages its state with two variables, unix_graph_maybe_cyclic
and unix_graph_grouped, both of which are set to false in the
initial state.
When an AF_UNIX socket is passed to an in-flight AF_UNIX socket,
unix_update_graph() sets unix_graph_maybe_cyclic to true and
unix_graph_grouped to false, making the next GC invocation call
unix_walk_scc() to group SCCs.
Once unix_walk_scc() finishes, sockets in the same SCC are linked
via vertex->scc_entry. Then, unix_graph_grouped is set to true
so that the following GC invocations can skip Tarjan's algorithm
and simply iterate through the list in unix_walk_scc_fast().
In addition, if we know there is at least one cyclic reference,
we set unix_graph_maybe_cyclic to true so that we do not skip GC.
So the state transitions as follows:
  (unix_graph_maybe_cyclic, unix_graph_grouped)
  =
  (false, false) -> (true, false) -> (true, true) or (false, true)
                          ^.______________/________________/
There is no transition to the initial state where both variables
are false.
If we consider the initial state as grouped, we can see that the
GC actually has only three states.
Let's consolidate the two variables into one enum.
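As a rough user-space sketch of the resulting tristate (the GRAPH_* names, enum gc_action and the helper functions are illustrative assumptions that only mirror the description above):

```c
#include <stdbool.h>

/* Illustrative names; the patch itself uses UNIX_GRAPH_* constants. */
enum graph_state {
	GRAPH_NOT_CYCLIC,   /* grouped, no cyclic SCC: GC can be skipped */
	GRAPH_MAYBE_CYCLIC, /* graph changed: SCCs must be regrouped */
	GRAPH_CYCLIC,       /* grouped, at least one cyclic SCC */
};

enum gc_action { GC_SKIP, GC_WALK_FULL, GC_WALK_FAST };

/* Map the old (maybe_cyclic, grouped) pair onto the tristate, treating
 * the initial (false, false) pair as "grouped, not cyclic".
 */
static enum graph_state from_bools(bool maybe_cyclic, bool grouped)
{
	if (!maybe_cyclic)
		return GRAPH_NOT_CYCLIC;

	return grouped ? GRAPH_CYCLIC : GRAPH_MAYBE_CYCLIC;
}

/* State after a GC run, given the number of cyclic SCCs that remain. */
static enum graph_state after_gc(unsigned long cyclic_sccs)
{
	return cyclic_sccs ? GRAPH_CYCLIC : GRAPH_NOT_CYCLIC;
}

/* What a GC invocation does in each state. */
static enum gc_action action_for(enum graph_state state)
{
	switch (state) {
	case GRAPH_NOT_CYCLIC:
		return GC_SKIP;
	case GRAPH_CYCLIC:
		return GC_WALK_FAST;
	default:
		return GC_WALK_FULL;
	}
}
```

Both old pairs with unix_graph_maybe_cyclic == false map to the same value here, which is why three values are enough.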
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
net/unix/garbage.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index 9f62d5097973..7528e2db1293 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -121,8 +121,13 @@ static struct unix_vertex *unix_edge_successor(struct unix_edge *edge)
return edge->successor->vertex;
}
-static bool unix_graph_maybe_cyclic;
-static bool unix_graph_grouped;
+enum {
+ UNIX_GRAPH_NOT_CYCLIC,
+ UNIX_GRAPH_MAYBE_CYCLIC,
+ UNIX_GRAPH_CYCLIC,
+};
+
+static unsigned char unix_graph_state;
static void unix_update_graph(struct unix_vertex *vertex)
{
@@ -132,8 +137,7 @@ static void unix_update_graph(struct unix_vertex *vertex)
if (!vertex)
return;
- unix_graph_maybe_cyclic = true;
- unix_graph_grouped = false;
+ unix_graph_state = UNIX_GRAPH_MAYBE_CYCLIC;
}
static LIST_HEAD(unix_unvisited_vertices);
@@ -536,8 +540,7 @@ static void unix_walk_scc(struct sk_buff_head *hitlist)
swap(unix_vertex_unvisited_index, unix_vertex_grouped_index);
unix_graph_cyclic_sccs = cyclic_sccs;
- unix_graph_maybe_cyclic = !!unix_graph_cyclic_sccs;
- unix_graph_grouped = true;
+ unix_graph_state = cyclic_sccs ? UNIX_GRAPH_CYCLIC : UNIX_GRAPH_NOT_CYCLIC;
}
static void unix_walk_scc_fast(struct sk_buff_head *hitlist)
@@ -570,7 +573,7 @@ static void unix_walk_scc_fast(struct sk_buff_head *hitlist)
list_replace_init(&unix_visited_vertices, &unix_unvisited_vertices);
unix_graph_cyclic_sccs = cyclic_sccs;
- unix_graph_maybe_cyclic = !!unix_graph_cyclic_sccs;
+ unix_graph_state = cyclic_sccs ? UNIX_GRAPH_CYCLIC : UNIX_GRAPH_NOT_CYCLIC;
}
static bool gc_in_progress;
@@ -582,14 +585,14 @@ static void __unix_gc(struct work_struct *work)
spin_lock(&unix_gc_lock);
- if (!unix_graph_maybe_cyclic) {
+ if (unix_graph_state == UNIX_GRAPH_NOT_CYCLIC) {
spin_unlock(&unix_gc_lock);
goto skip_gc;
}
__skb_queue_head_init(&hitlist);
- if (unix_graph_grouped)
+ if (unix_graph_state == UNIX_GRAPH_CYCLIC)
unix_walk_scc_fast(&hitlist);
else
unix_walk_scc(&hitlist);
--
2.52.0.rc1.455.g30608eb744-goog
* [PATCH v1 net-next 3/7] af_unix: Don't trigger GC from close() if unnecessary.
2025-11-15 2:08 [PATCH v1 net-next 0/7] af_unix: GC cleanup and optimisation Kuniyuki Iwashima
2025-11-15 2:08 ` [PATCH v1 net-next 1/7] af_unix: Count cyclic SCC Kuniyuki Iwashima
2025-11-15 2:08 ` [PATCH v1 net-next 2/7] af_unix: Simplify GC state Kuniyuki Iwashima
@ 2025-11-15 2:08 ` Kuniyuki Iwashima
2025-11-15 2:08 ` [PATCH v1 net-next 4/7] af_unix: Don't call wait_for_unix_gc() on every sendmsg() Kuniyuki Iwashima
` (4 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2025-11-15 2:08 UTC (permalink / raw)
To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
We have been triggering GC on every close() if there is even one
inflight AF_UNIX socket.
This is because the old GC implementation had no idea of the graph
shape formed by SCM_RIGHTS references.
The new GC knows whether there could be a cyclic reference or not,
and we can do better.
Let's not trigger GC from close() if there is no cyclic reference
or GC is already in progress.
While at it, unix_gc() is renamed to unix_schedule_gc() as it does
not actually perform GC since commit 8b90a9f819dc ("af_unix: Run
GC on only one CPU.").
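The two early exits can be modelled in user space roughly as below. This is a sketch under assumptions: C11 relaxed atomics stand in for READ_ONCE()/WRITE_ONCE(), a plain counter (queued_runs, a made-up name) stands in for queue_work(), and schedule_gc() is a hypothetical helper rather than the kernel function.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum { GRAPH_NOT_CYCLIC, GRAPH_MAYBE_CYCLIC, GRAPH_CYCLIC };

static _Atomic unsigned char graph_state = GRAPH_NOT_CYCLIC;
static atomic_bool gc_in_progress = false;
static unsigned int queued_runs; /* stand-in for queue_work() */

/* Returns true only if a new GC run was queued. */
static bool schedule_gc(void)
{
	/* No cyclic reference can exist: nothing to collect. */
	if (atomic_load_explicit(&graph_state, memory_order_relaxed) ==
	    GRAPH_NOT_CYCLIC)
		return false;

	/* A run is already queued or running. */
	if (atomic_load_explicit(&gc_in_progress, memory_order_relaxed))
		return false;

	atomic_store_explicit(&gc_in_progress, true, memory_order_relaxed);
	queued_runs++;
	return true;
}
```

In this model, a close() issued while the graph is known to be acyclic costs one relaxed load and returns without queuing anything.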
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
net/unix/af_unix.c | 3 +--
net/unix/af_unix.h | 3 +--
net/unix/garbage.c | 27 +++++++++++++++++----------
3 files changed, 19 insertions(+), 14 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 3b44cadaed96..4a80dac56bbd 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -733,8 +733,7 @@ static void unix_release_sock(struct sock *sk, int embrion)
/* ---- Socket is dead now and most probably destroyed ---- */
- if (READ_ONCE(unix_tot_inflight))
- unix_gc(); /* Garbage collect fds */
+ unix_schedule_gc();
}
struct unix_peercred {
diff --git a/net/unix/af_unix.h b/net/unix/af_unix.h
index 59db179df9bb..0fb5b348ad94 100644
--- a/net/unix/af_unix.h
+++ b/net/unix/af_unix.h
@@ -24,13 +24,12 @@ struct unix_skb_parms {
#define UNIXCB(skb) (*(struct unix_skb_parms *)&((skb)->cb))
/* GC for SCM_RIGHTS */
-extern unsigned int unix_tot_inflight;
void unix_add_edges(struct scm_fp_list *fpl, struct unix_sock *receiver);
void unix_del_edges(struct scm_fp_list *fpl);
void unix_update_edges(struct unix_sock *receiver);
int unix_prepare_fpl(struct scm_fp_list *fpl);
void unix_destroy_fpl(struct scm_fp_list *fpl);
-void unix_gc(void);
+void unix_schedule_gc(void);
void wait_for_unix_gc(struct scm_fp_list *fpl);
/* SOCK_DIAG */
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index 7528e2db1293..190dea73f0ab 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -137,7 +137,7 @@ static void unix_update_graph(struct unix_vertex *vertex)
if (!vertex)
return;
- unix_graph_state = UNIX_GRAPH_MAYBE_CYCLIC;
+ WRITE_ONCE(unix_graph_state, UNIX_GRAPH_MAYBE_CYCLIC);
}
static LIST_HEAD(unix_unvisited_vertices);
@@ -200,7 +200,7 @@ static void unix_free_vertices(struct scm_fp_list *fpl)
}
static DEFINE_SPINLOCK(unix_gc_lock);
-unsigned int unix_tot_inflight;
+static unsigned int unix_tot_inflight;
void unix_add_edges(struct scm_fp_list *fpl, struct unix_sock *receiver)
{
@@ -540,7 +540,8 @@ static void unix_walk_scc(struct sk_buff_head *hitlist)
swap(unix_vertex_unvisited_index, unix_vertex_grouped_index);
unix_graph_cyclic_sccs = cyclic_sccs;
- unix_graph_state = cyclic_sccs ? UNIX_GRAPH_CYCLIC : UNIX_GRAPH_NOT_CYCLIC;
+ WRITE_ONCE(unix_graph_state,
+ cyclic_sccs ? UNIX_GRAPH_CYCLIC : UNIX_GRAPH_NOT_CYCLIC);
}
static void unix_walk_scc_fast(struct sk_buff_head *hitlist)
@@ -573,12 +574,13 @@ static void unix_walk_scc_fast(struct sk_buff_head *hitlist)
list_replace_init(&unix_visited_vertices, &unix_unvisited_vertices);
unix_graph_cyclic_sccs = cyclic_sccs;
- unix_graph_state = cyclic_sccs ? UNIX_GRAPH_CYCLIC : UNIX_GRAPH_NOT_CYCLIC;
+ WRITE_ONCE(unix_graph_state,
+ cyclic_sccs ? UNIX_GRAPH_CYCLIC : UNIX_GRAPH_NOT_CYCLIC);
}
static bool gc_in_progress;
-static void __unix_gc(struct work_struct *work)
+static void unix_gc(struct work_struct *work)
{
struct sk_buff_head hitlist;
struct sk_buff *skb;
@@ -609,10 +611,16 @@ static void __unix_gc(struct work_struct *work)
WRITE_ONCE(gc_in_progress, false);
}
-static DECLARE_WORK(unix_gc_work, __unix_gc);
+static DECLARE_WORK(unix_gc_work, unix_gc);
-void unix_gc(void)
+void unix_schedule_gc(void)
{
+ if (READ_ONCE(unix_graph_state) == UNIX_GRAPH_NOT_CYCLIC)
+ return;
+
+ if (READ_ONCE(gc_in_progress))
+ return;
+
WRITE_ONCE(gc_in_progress, true);
queue_work(system_dfl_wq, &unix_gc_work);
}
@@ -628,9 +636,8 @@ void wait_for_unix_gc(struct scm_fp_list *fpl)
* Paired with the WRITE_ONCE() in unix_inflight(),
* unix_notinflight(), and __unix_gc().
*/
- if (READ_ONCE(unix_tot_inflight) > UNIX_INFLIGHT_TRIGGER_GC &&
- !READ_ONCE(gc_in_progress))
- unix_gc();
+ if (READ_ONCE(unix_tot_inflight) > UNIX_INFLIGHT_TRIGGER_GC)
+ unix_schedule_gc();
/* Penalise users who want to send AF_UNIX sockets
* but whose sockets have not been received yet.
--
2.52.0.rc1.455.g30608eb744-goog
* [PATCH v1 net-next 4/7] af_unix: Don't call wait_for_unix_gc() on every sendmsg().
2025-11-15 2:08 [PATCH v1 net-next 0/7] af_unix: GC cleanup and optimisation Kuniyuki Iwashima
` (2 preceding siblings ...)
2025-11-15 2:08 ` [PATCH v1 net-next 3/7] af_unix: Don't trigger GC from close() if unnecessary Kuniyuki Iwashima
@ 2025-11-15 2:08 ` Kuniyuki Iwashima
2025-11-15 2:08 ` [PATCH v1 net-next 5/7] af_unix: Refine wait_for_unix_gc() Kuniyuki Iwashima
` (3 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2025-11-15 2:08 UTC (permalink / raw)
To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
We have been calling wait_for_unix_gc() on every sendmsg() in case
there are too many inflight AF_UNIX sockets.
This is also because the old GC implementation had poor knowledge
of the inflight sockets and had to suspect every sendmsg().
This was improved by commit d9f21b361333 ("af_unix: Try to run GC
async."), but we do not even need to call wait_for_unix_gc() if the
process is not sending AF_UNIX sockets.
The wait_for_unix_gc() call only helps when a malicious process
continues to create cyclic references, and we can detect that
in a better place and slow it down.
Let's move wait_for_unix_gc() to unix_prepare_fpl(), which is called
only when an AF_UNIX socket fd is passed via SCM_RIGHTS.
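For reference, this is roughly what the affected kind of sendmsg() looks like from user space: a message whose ancillary data carries a file descriptor via SCM_RIGHTS. send_fd() and recv_fd() are illustrative helper names, not part of this series, and error handling is kept minimal.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_ctrl {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send one fd over a connected AF_UNIX socket.  Returns 0 on success. */
static int send_fd(int sock, int fd)
{
	char byte = 'x'; /* at least one byte of real data is sent */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctrl ctrl;
	struct msghdr msg = {
		.msg_iov = &iov,
		.msg_iovlen = 1,
		.msg_control = ctrl.buf,
		.msg_controllen = sizeof(ctrl.buf),
	};
	struct cmsghdr *cmsg;

	memset(&ctrl, 0, sizeof(ctrl));
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one fd.  Returns the new descriptor, or -1 on failure. */
static int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_ctrl ctrl;
	struct msghdr msg = {
		.msg_iov = &iov,
		.msg_iovlen = 1,
		.msg_control = ctrl.buf,
		.msg_controllen = sizeof(ctrl.buf),
	};
	struct cmsghdr *cmsg;
	int fd;

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

Going by the description above, only a send like this whose passed fd is itself an AF_UNIX socket would reach wait_for_unix_gc() after this patch; a sendmsg() without such ancillary data no longer calls it.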
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
net/unix/af_unix.c | 4 ----
net/unix/af_unix.h | 1 -
net/unix/garbage.c | 9 ++++++---
3 files changed, 6 insertions(+), 8 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 4a80dac56bbd..34952242bd81 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2098,8 +2098,6 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
if (err < 0)
return err;
- wait_for_unix_gc(scm.fp);
-
if (msg->msg_flags & MSG_OOB) {
err = -EOPNOTSUPP;
goto out;
@@ -2393,8 +2391,6 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
if (err < 0)
return err;
- wait_for_unix_gc(scm.fp);
-
if (msg->msg_flags & MSG_OOB) {
err = -EOPNOTSUPP;
#if IS_ENABLED(CONFIG_AF_UNIX_OOB)
diff --git a/net/unix/af_unix.h b/net/unix/af_unix.h
index 0fb5b348ad94..2f1bfe3217c1 100644
--- a/net/unix/af_unix.h
+++ b/net/unix/af_unix.h
@@ -30,7 +30,6 @@ void unix_update_edges(struct unix_sock *receiver);
int unix_prepare_fpl(struct scm_fp_list *fpl);
void unix_destroy_fpl(struct scm_fp_list *fpl);
void unix_schedule_gc(void);
-void wait_for_unix_gc(struct scm_fp_list *fpl);
/* SOCK_DIAG */
long unix_inq_len(struct sock *sk);
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index 190dea73f0ab..280b9b07b1c0 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -282,6 +282,8 @@ void unix_update_edges(struct unix_sock *receiver)
}
}
+static void wait_for_unix_gc(struct scm_fp_list *fpl);
+
int unix_prepare_fpl(struct scm_fp_list *fpl)
{
struct unix_vertex *vertex;
@@ -303,6 +305,8 @@ int unix_prepare_fpl(struct scm_fp_list *fpl)
if (!fpl->edges)
goto err;
+ wait_for_unix_gc(fpl);
+
return 0;
err:
@@ -628,7 +632,7 @@ void unix_schedule_gc(void)
#define UNIX_INFLIGHT_TRIGGER_GC 16000
#define UNIX_INFLIGHT_SANE_USER (SCM_MAX_FD * 8)
-void wait_for_unix_gc(struct scm_fp_list *fpl)
+static void wait_for_unix_gc(struct scm_fp_list *fpl)
{
/* If number of inflight sockets is insane,
* force a garbage collect right now.
@@ -642,8 +646,7 @@ void wait_for_unix_gc(struct scm_fp_list *fpl)
/* Penalise users who want to send AF_UNIX sockets
* but whose sockets have not been received yet.
*/
- if (!fpl || !fpl->count_unix ||
- READ_ONCE(fpl->user->unix_inflight) < UNIX_INFLIGHT_SANE_USER)
+ if (READ_ONCE(fpl->user->unix_inflight) < UNIX_INFLIGHT_SANE_USER)
return;
if (READ_ONCE(gc_in_progress))
--
2.52.0.rc1.455.g30608eb744-goog
* [PATCH v1 net-next 5/7] af_unix: Refine wait_for_unix_gc().
2025-11-15 2:08 [PATCH v1 net-next 0/7] af_unix: GC cleanup and optimisation Kuniyuki Iwashima
` (3 preceding siblings ...)
2025-11-15 2:08 ` [PATCH v1 net-next 4/7] af_unix: Don't call wait_for_unix_gc() on every sendmsg() Kuniyuki Iwashima
@ 2025-11-15 2:08 ` Kuniyuki Iwashima
2025-11-15 2:08 ` [PATCH v1 net-next 6/7] af_unix: Remove unix_tot_inflight Kuniyuki Iwashima
` (2 subsequent siblings)
7 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2025-11-15 2:08 UTC (permalink / raw)
To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
unix_tot_inflight is a poor metric, only telling the number of
inflight AF_UNIX sockets, and we should use unix_graph_state instead.
Also, if the receiver is catching up with the passed fds, the
sender does not need to schedule GC.
GC only helps when there are unreferenced cyclic SCM_RIGHTS
references, and in such a situation, the malicious sendmsg() will
continue to call wait_for_unix_gc() and hit the
UNIX_INFLIGHT_SANE_USER condition.
Let's make only malicious users schedule GC and wait for it to
finish if a cyclic reference existed during the previous GC run.
Then, sane users will pay almost no cost for wait_for_unix_gc().
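The resulting decision can be summarised with a small user-space model. Everything named here is an assumption made for illustration: enum gc_wait_action and wait_decision() are hypothetical, SCM_MAX_FD_MODEL assumes SCM_MAX_FD is 253 (giving a threshold of 2024), and the gc_in_progress check inside unix_schedule_gc() is left out.

```c
#include <stdbool.h>

enum gc_wait_action {
	GC_NONE,              /* return immediately */
	GC_SCHEDULE,          /* queue GC, do not wait */
	GC_SCHEDULE_AND_WAIT, /* queue GC and wait for it to finish */
};

/* Assumption: SCM_MAX_FD is 253, so the threshold is 253 * 8 = 2024. */
#define SCM_MAX_FD_MODEL   253
#define INFLIGHT_SANE_USER (SCM_MAX_FD_MODEL * 8)

static enum gc_wait_action wait_decision(bool graph_not_cyclic,
					 unsigned long user_inflight,
					 unsigned long cyclic_sccs)
{
	/* No cyclic reference can exist: GC would find nothing. */
	if (graph_not_cyclic)
		return GC_NONE;

	/* The receiver is keeping up with the passed fds. */
	if (user_inflight < INFLIGHT_SANE_USER)
		return GC_NONE;

	/* Wait only if the previous GC run saw a cyclic SCC. */
	return cyclic_sccs ? GC_SCHEDULE_AND_WAIT : GC_SCHEDULE;
}
```

In this model, a sender below the per-user threshold always gets GC_NONE, whatever the total number of inflight sockets in the system.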
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
net/unix/garbage.c | 21 ++++++++-------------
1 file changed, 8 insertions(+), 13 deletions(-)
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index 280b9b07b1c0..a6929226d40d 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -543,7 +543,7 @@ static void unix_walk_scc(struct sk_buff_head *hitlist)
list_replace_init(&unix_visited_vertices, &unix_unvisited_vertices);
swap(unix_vertex_unvisited_index, unix_vertex_grouped_index);
- unix_graph_cyclic_sccs = cyclic_sccs;
+ WRITE_ONCE(unix_graph_cyclic_sccs, cyclic_sccs);
WRITE_ONCE(unix_graph_state,
cyclic_sccs ? UNIX_GRAPH_CYCLIC : UNIX_GRAPH_NOT_CYCLIC);
}
@@ -577,7 +577,7 @@ static void unix_walk_scc_fast(struct sk_buff_head *hitlist)
list_replace_init(&unix_visited_vertices, &unix_unvisited_vertices);
- unix_graph_cyclic_sccs = cyclic_sccs;
+ WRITE_ONCE(unix_graph_cyclic_sccs, cyclic_sccs);
WRITE_ONCE(unix_graph_state,
cyclic_sccs ? UNIX_GRAPH_CYCLIC : UNIX_GRAPH_NOT_CYCLIC);
}
@@ -629,19 +629,12 @@ void unix_schedule_gc(void)
queue_work(system_dfl_wq, &unix_gc_work);
}
-#define UNIX_INFLIGHT_TRIGGER_GC 16000
-#define UNIX_INFLIGHT_SANE_USER (SCM_MAX_FD * 8)
+#define UNIX_INFLIGHT_SANE_USER (SCM_MAX_FD * 8)
static void wait_for_unix_gc(struct scm_fp_list *fpl)
{
- /* If number of inflight sockets is insane,
- * force a garbage collect right now.
- *
- * Paired with the WRITE_ONCE() in unix_inflight(),
- * unix_notinflight(), and __unix_gc().
- */
- if (READ_ONCE(unix_tot_inflight) > UNIX_INFLIGHT_TRIGGER_GC)
- unix_schedule_gc();
+ if (READ_ONCE(unix_graph_state) == UNIX_GRAPH_NOT_CYCLIC)
+ return;
/* Penalise users who want to send AF_UNIX sockets
* but whose sockets have not been received yet.
@@ -649,6 +642,8 @@ static void wait_for_unix_gc(struct scm_fp_list *fpl)
if (READ_ONCE(fpl->user->unix_inflight) < UNIX_INFLIGHT_SANE_USER)
return;
- if (READ_ONCE(gc_in_progress))
+ unix_schedule_gc();
+
+ if (READ_ONCE(unix_graph_cyclic_sccs))
flush_work(&unix_gc_work);
}
--
2.52.0.rc1.455.g30608eb744-goog
* [PATCH v1 net-next 6/7] af_unix: Remove unix_tot_inflight.
2025-11-15 2:08 [PATCH v1 net-next 0/7] af_unix: GC cleanup and optimisation Kuniyuki Iwashima
` (4 preceding siblings ...)
2025-11-15 2:08 ` [PATCH v1 net-next 5/7] af_unix: Refine wait_for_unix_gc() Kuniyuki Iwashima
@ 2025-11-15 2:08 ` Kuniyuki Iwashima
2025-11-15 2:08 ` [PATCH v1 net-next 7/7] af_unix: Consolidate unix_schedule_gc() and wait_for_unix_gc() Kuniyuki Iwashima
2025-11-19 3:30 ` [PATCH v1 net-next 0/7] af_unix: GC cleanup and optimisation patchwork-bot+netdevbpf
7 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2025-11-15 2:08 UTC (permalink / raw)
To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
unix_tot_inflight is no longer used.
Let's remove it.
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
net/unix/garbage.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index a6929226d40d..fe1f74345b66 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -200,7 +200,6 @@ static void unix_free_vertices(struct scm_fp_list *fpl)
}
static DEFINE_SPINLOCK(unix_gc_lock);
-static unsigned int unix_tot_inflight;
void unix_add_edges(struct scm_fp_list *fpl, struct unix_sock *receiver)
{
@@ -226,7 +225,6 @@ void unix_add_edges(struct scm_fp_list *fpl, struct unix_sock *receiver)
} while (i < fpl->count_unix);
receiver->scm_stat.nr_unix_fds += fpl->count_unix;
- WRITE_ONCE(unix_tot_inflight, unix_tot_inflight + fpl->count_unix);
out:
WRITE_ONCE(fpl->user->unix_inflight, fpl->user->unix_inflight + fpl->count);
@@ -257,7 +255,6 @@ void unix_del_edges(struct scm_fp_list *fpl)
receiver = fpl->edges[0].successor;
receiver->scm_stat.nr_unix_fds -= fpl->count_unix;
}
- WRITE_ONCE(unix_tot_inflight, unix_tot_inflight - fpl->count_unix);
out:
WRITE_ONCE(fpl->user->unix_inflight, fpl->user->unix_inflight - fpl->count);
--
2.52.0.rc1.455.g30608eb744-goog
* [PATCH v1 net-next 7/7] af_unix: Consolidate unix_schedule_gc() and wait_for_unix_gc().
2025-11-15 2:08 [PATCH v1 net-next 0/7] af_unix: GC cleanup and optimisation Kuniyuki Iwashima
` (5 preceding siblings ...)
2025-11-15 2:08 ` [PATCH v1 net-next 6/7] af_unix: Remove unix_tot_inflight Kuniyuki Iwashima
@ 2025-11-15 2:08 ` Kuniyuki Iwashima
2025-11-19 3:30 ` [PATCH v1 net-next 0/7] af_unix: GC cleanup and optimisation patchwork-bot+netdevbpf
7 siblings, 0 replies; 9+ messages in thread
From: Kuniyuki Iwashima @ 2025-11-15 2:08 UTC (permalink / raw)
To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
unix_schedule_gc() and wait_for_unix_gc() share some code.
Let's consolidate the two.
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
net/unix/af_unix.c | 2 +-
net/unix/af_unix.h | 2 +-
net/unix/garbage.c | 28 +++++++++-------------------
3 files changed, 11 insertions(+), 21 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 34952242bd81..e518116f8171 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -733,7 +733,7 @@ static void unix_release_sock(struct sock *sk, int embrion)
/* ---- Socket is dead now and most probably destroyed ---- */
- unix_schedule_gc();
+ unix_schedule_gc(NULL);
}
struct unix_peercred {
diff --git a/net/unix/af_unix.h b/net/unix/af_unix.h
index 2f1bfe3217c1..c4f1b2da363d 100644
--- a/net/unix/af_unix.h
+++ b/net/unix/af_unix.h
@@ -29,7 +29,7 @@ void unix_del_edges(struct scm_fp_list *fpl);
void unix_update_edges(struct unix_sock *receiver);
int unix_prepare_fpl(struct scm_fp_list *fpl);
void unix_destroy_fpl(struct scm_fp_list *fpl);
-void unix_schedule_gc(void);
+void unix_schedule_gc(struct user_struct *user);
/* SOCK_DIAG */
long unix_inq_len(struct sock *sk);
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index fe1f74345b66..78323d43e63e 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -279,8 +279,6 @@ void unix_update_edges(struct unix_sock *receiver)
}
}
-static void wait_for_unix_gc(struct scm_fp_list *fpl);
-
int unix_prepare_fpl(struct scm_fp_list *fpl)
{
struct unix_vertex *vertex;
@@ -302,7 +300,7 @@ int unix_prepare_fpl(struct scm_fp_list *fpl)
if (!fpl->edges)
goto err;
- wait_for_unix_gc(fpl);
+ unix_schedule_gc(fpl->user);
return 0;
@@ -614,21 +612,9 @@ static void unix_gc(struct work_struct *work)
static DECLARE_WORK(unix_gc_work, unix_gc);
-void unix_schedule_gc(void)
-{
- if (READ_ONCE(unix_graph_state) == UNIX_GRAPH_NOT_CYCLIC)
- return;
-
- if (READ_ONCE(gc_in_progress))
- return;
-
- WRITE_ONCE(gc_in_progress, true);
- queue_work(system_dfl_wq, &unix_gc_work);
-}
-
#define UNIX_INFLIGHT_SANE_USER (SCM_MAX_FD * 8)
-static void wait_for_unix_gc(struct scm_fp_list *fpl)
+void unix_schedule_gc(struct user_struct *user)
{
if (READ_ONCE(unix_graph_state) == UNIX_GRAPH_NOT_CYCLIC)
return;
@@ -636,11 +622,15 @@ static void wait_for_unix_gc(struct scm_fp_list *fpl)
/* Penalise users who want to send AF_UNIX sockets
* but whose sockets have not been received yet.
*/
- if (READ_ONCE(fpl->user->unix_inflight) < UNIX_INFLIGHT_SANE_USER)
+ if (user &&
+ READ_ONCE(user->unix_inflight) < UNIX_INFLIGHT_SANE_USER)
return;
- unix_schedule_gc();
+ if (!READ_ONCE(gc_in_progress)) {
+ WRITE_ONCE(gc_in_progress, true);
+ queue_work(system_dfl_wq, &unix_gc_work);
+ }
- if (READ_ONCE(unix_graph_cyclic_sccs))
+ if (user && READ_ONCE(unix_graph_cyclic_sccs))
flush_work(&unix_gc_work);
}
--
2.52.0.rc1.455.g30608eb744-goog
* Re: [PATCH v1 net-next 0/7] af_unix: GC cleanup and optimisation.
2025-11-15 2:08 [PATCH v1 net-next 0/7] af_unix: GC cleanup and optimisation Kuniyuki Iwashima
` (6 preceding siblings ...)
2025-11-15 2:08 ` [PATCH v1 net-next 7/7] af_unix: Consolidate unix_schedule_gc() and wait_for_unix_gc() Kuniyuki Iwashima
@ 2025-11-19 3:30 ` patchwork-bot+netdevbpf
7 siblings, 0 replies; 9+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-11-19 3:30 UTC (permalink / raw)
To: Kuniyuki Iwashima; +Cc: davem, edumazet, kuba, pabeni, horms, kuni1840, netdev
Hello:
This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Sat, 15 Nov 2025 02:08:31 +0000 you wrote:
> Currently, AF_UNIX GC is triggered from close() and sendmsg()
> based on the number of inflight AF_UNIX sockets.
>
> This is because the old GC implementation had no idea of the
> shape of the graph formed by SCM_RIGHTS references.
>
> The new GC knows whether cyclic references (could) exist.
>
> [...]
Here is the summary with links:
- [v1,net-next,1/7] af_unix: Count cyclic SCC.
https://git.kernel.org/netdev/net-next/c/58b47c713711
- [v1,net-next,2/7] af_unix: Simplify GC state.
https://git.kernel.org/netdev/net-next/c/6b6f3c71fe56
- [v1,net-next,3/7] af_unix: Don't trigger GC from close() if unnecessary.
https://git.kernel.org/netdev/net-next/c/da8fc7a39be8
- [v1,net-next,4/7] af_unix: Don't call wait_for_unix_gc() on every sendmsg().
https://git.kernel.org/netdev/net-next/c/384900542dc8
- [v1,net-next,5/7] af_unix: Refine wait_for_unix_gc().
https://git.kernel.org/netdev/net-next/c/e29c7a4cec86
- [v1,net-next,6/7] af_unix: Remove unix_tot_inflight.
https://git.kernel.org/netdev/net-next/c/ab8b23150abc
- [v1,net-next,7/7] af_unix: Consolidate unix_schedule_gc() and wait_for_unix_gc().
https://git.kernel.org/netdev/net-next/c/24fa77dad25c
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html