[PATCH 0/2] af_unix: Fix priority inversion issue

* [PATCH 0/2] af_unix: Fix priority inversion issue
@ 2026-07-01 16:35 Nam Cao
  2026-07-01 16:35 ` [PATCH 1/2] af_unix: Do not wait for garbage collector in sendmsg() Nam Cao
  2026-07-01 16:35 ` [PATCH 2/2] af_unix: Clean up unix_schedule_gc() Nam Cao
  0 siblings, 2 replies; 9+ messages in thread
From: Nam Cao @ 2026-07-01 16:35 UTC
  To: Kuniyuki Iwashima, David S . Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, netdev, linux-kernel, linux-rt-devel
  Cc: Nam Cao

Hi,

While auditing AF_UNIX sockets, I noticed that the sendmsg() code paths can
block on the garbage collector, which runs in a workqueue. This can cause
priority inversion and added latency for real-time users.

The implementation does kindly avoid blocking "sane users". However, it is
impossible to tell whether the kernel's definition of "sane users"
accurately describes all users out there.

After digging into the history to find out why sendmsg() needs to wait for
the garbage collector, I concluded that those reasons no longer apply.

The first patch removes that wait, and the second patch is a simple
follow-up cleanup.

Nam Cao (2):
  af_unix: Do not wait for garbage collector in sendmsg()
  af_unix: Clean up unix_schedule_gc()

 net/unix/af_unix.c |  2 +-
 net/unix/af_unix.h |  2 +-
 net/unix/garbage.c | 16 +---------------
 3 files changed, 3 insertions(+), 17 deletions(-)

-- 
2.47.3

* [PATCH 1/2] af_unix: Do not wait for garbage collector in sendmsg()
  2026-07-01 16:35 [PATCH 0/2] af_unix: Fix priority inversion issue Nam Cao
@ 2026-07-01 16:35 ` Nam Cao
  2026-07-02  3:27   ` Kuniyuki Iwashima
  2026-07-02 16:36   ` sashiko-bot
  2026-07-01 16:35 ` [PATCH 2/2] af_unix: Clean up unix_schedule_gc() Nam Cao
  1 sibling, 2 replies; 9+ messages in thread
From: Nam Cao @ 2026-07-01 16:35 UTC
  To: Kuniyuki Iwashima, David S . Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, netdev, linux-kernel, linux-rt-devel
  Cc: Nam Cao

AF_UNIX sockets' sendmsg() schedules and blocks on the garbage collector if
the user has too many inflight AF_UNIX sockets. This causes real-time
issues: high-priority processes that do need to send many AF_UNIX sockets
get blocked by the garbage collector, which runs in a workqueue, resulting
in priority inversion.

The reason for blocking on garbage collector goes back to 2008, when
it was reported that "Local/unprivileged users can cause soft lockups
and take out system processes by triggering the OOM killer":
https://bugzilla.redhat.com/show_bug.cgi?id=470201

The soft lockup occurred because a process could keep queueing AF_UNIX
sockets to another process that was exiting. Back in 2008, the garbage
collector was run synchronously by the exiting process, so the continued
queueing of AF_UNIX sockets kept that process from exiting.

The solution to that issue was to force sendmsg() to wait for an ongoing
garbage collection.

The OOM killer issue was brought up again in 2010:
https://lore.kernel.org/lkml/AANLkTi=Q967xpX0KLMwX-=_4_1AKO5wjHEuJ1TrNjCj9@mail.gmail.com/

To resolve that report, sendmsg() was changed so that, besides blocking on
the garbage collector, it also schedules the garbage collector if the
number of inflight AF_UNIX sockets in the system is too high.

Then in 2015, once again, the OOM killer problem was brought up:
https://lore.kernel.org/lkml/20151228141435.GA13351@1wt.eu/

That time, the issue was resolved by disallowing a user from having more
inflight AF_UNIX sockets than their RLIMIT_NOFILE. That was done by commit
712f4aad406b ("unix: properly account for FDs passed over unix sockets")
and commit 415e3d3e90ce ("unix: correctly track in-flight fds in sending
process user_struct").

Now, sendmsg() does not have to block on the garbage collector anymore,
because:

  - The OOM killer issue has already been addressed by checking
    RLIMIT_NOFILE.

  - The soft lockup issue is no longer relevant, because the garbage
    collector now runs asynchronously since commit d9f21b361333 ("af_unix:
    Try to run GC async.")

Therefore, remove the unix_schedule_gc() call from the sendmsg() path to
prevent priority inversion. With this patch applied, I ran all the
reproducers from the bug reports mentioned above and observed no problems.

Signed-off-by: Nam Cao <namcao@linutronix.de>
---
 net/unix/garbage.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index 0783555e2526..f180c59b3da9 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -300,8 +300,6 @@ int unix_prepare_fpl(struct scm_fp_list *fpl)
 	if (!fpl->edges)
 		goto err;

-	unix_schedule_gc(fpl->user);
-
 	return 0;

 err:
-- 
2.47.3

* [PATCH 2/2] af_unix: Clean up unix_schedule_gc()
  2026-07-01 16:35 [PATCH 0/2] af_unix: Fix priority inversion issue Nam Cao
  2026-07-01 16:35 ` [PATCH 1/2] af_unix: Do not wait for garbage collector in sendmsg() Nam Cao
@ 2026-07-01 16:35 ` Nam Cao
  1 sibling, 0 replies; 9+ messages in thread
From: Nam Cao @ 2026-07-01 16:35 UTC
  To: Kuniyuki Iwashima, David S . Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, netdev, linux-kernel, linux-rt-devel
  Cc: Nam Cao

After the previous patch, the only remaining caller of unix_schedule_gc()
passes NULL as the argument. Therefore, simplify the function by deleting
the parameter and the code that depended on it.

Signed-off-by: Nam Cao <namcao@linutronix.de>
---
 net/unix/af_unix.c |  2 +-
 net/unix/af_unix.h |  2 +-
 net/unix/garbage.c | 14 +-------------
 3 files changed, 3 insertions(+), 15 deletions(-)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index f7a9d55eee8a..759db734a866 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -733,7 +733,7 @@ static void unix_release_sock(struct sock *sk, int embrion)
 
 	/* ---- Socket is dead now and most probably destroyed ---- */
 
-	unix_schedule_gc(NULL);
+	unix_schedule_gc();
 }
 
 struct unix_peercred {
diff --git a/net/unix/af_unix.h b/net/unix/af_unix.h
index 8119dbeef3a3..600d56fdcde4 100644
--- a/net/unix/af_unix.h
+++ b/net/unix/af_unix.h
@@ -30,7 +30,7 @@ void unix_update_edges(struct unix_sock *receiver);
 int unix_prepare_fpl(struct scm_fp_list *fpl);
 void unix_destroy_fpl(struct scm_fp_list *fpl);
 void unix_peek_fpl(struct scm_fp_list *fpl);
-void unix_schedule_gc(struct user_struct *user);
+void unix_schedule_gc(void);
 
 /* SOCK_DIAG */
 long unix_inq_len(struct sock *sk);
diff --git a/net/unix/garbage.c b/net/unix/garbage.c
index f180c59b3da9..d46aeb9d2051 100644
--- a/net/unix/garbage.c
+++ b/net/unix/garbage.c
@@ -635,23 +635,11 @@ static void unix_gc(struct work_struct *work)
 
 static DECLARE_WORK(unix_gc_work, unix_gc);
 
-#define UNIX_INFLIGHT_SANE_USER		(SCM_MAX_FD * 8)
-
-void unix_schedule_gc(struct user_struct *user)
+void unix_schedule_gc(void)
 {
 	if (READ_ONCE(unix_graph_state) == UNIX_GRAPH_NOT_CYCLIC)
 		return;
 
-	/* Penalise users who want to send AF_UNIX sockets
-	 * but whose sockets have not been received yet.
-	 */
-	if (user &&
-	    READ_ONCE(user->unix_inflight) < UNIX_INFLIGHT_SANE_USER)
-		return;
-
 	if (!READ_ONCE(gc_in_progress))
 		queue_work(system_dfl_wq, &unix_gc_work);
-
-	if (user && READ_ONCE(unix_graph_cyclic_sccs))
-		flush_work(&unix_gc_work);
 }
-- 
2.47.3


* Re: [PATCH 1/2] af_unix: Do not wait for garbage collector in sendmsg()
  2026-07-01 16:35 ` [PATCH 1/2] af_unix: Do not wait for garbage collector in sendmsg() Nam Cao
@ 2026-07-02  3:27   ` Kuniyuki Iwashima
  2026-07-02  3:56     ` Nam Cao
  2026-07-02 16:36   ` sashiko-bot
  1 sibling, 1 reply; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-07-02  3:27 UTC
  To: Nam Cao
  Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, netdev, linux-kernel, linux-rt-devel

On Wed, Jul 1, 2026 at 9:36 AM Nam Cao <namcao@linutronix.de> wrote:
>
> AF_UNIX sockets' sendmsg() schedules and blocks on the garbage collector if
> user has too many inflight unix sockets

AND there is a cyclic reference

>. This causes real-time issues, as
> high priority processes who do need to send lots of unix sockets get
> blocked by the garbage collector which runs as workqueue, causing a
> priority inversion scenario.

So the real problem is the process creating cyclic references.

>
> The reason for blocking on garbage collector goes back to 2008, when
> it was reported that "Local/unprivileged users can cause soft lockups
> and take out system processes by triggering the OOM killer":
> https://bugzilla.redhat.com/show_bug.cgi?id=470201
>
> The soft lockup was because a process can keep queueing AF_UNIX sockets to
> another process that is exiting. Back in 2008, the garbage collector was
> run synchronously by the exiting process, therefore keep queueing AF_UNIX
> sockets blocks that process from exiting.
>
> The solution to that issue was forcing sendmsg() to wait for ongoing
> garbage collector.
>
> The OOM killer issue was brought up again in 2010:
> https://lore.kernel.org/lkml/AANLkTi=Q967xpX0KLMwX-=_4_1AKO5wjHEuJ1TrNjCj9@mail.gmail.com/
>
> To resolve that report, beside blocking on the garbage collector, sendmsg()
> also schedules the garbage collector if the number of inflight AF_UNIX
> sockets in the system is too high.
>
> Then in 2015, once again, the OOM killer problem was brought up:
> https://lore.kernel.org/lkml/20151228141435.GA13351@1wt.eu/
>
> That time, the issue was resolved by disallowing a user from having more
> inflight AF_UNIX sockets than their RLIMIT_NOFILE. That was done by commit
> 712f4aad406b ("unix: properly account for FDs passed over unix sockets")
> and commit 415e3d3e90ce ("unix: correctly track in-flight fds in sending
> process user_struct").
>
> Now, sendmsg() does not have to block on the garbage collector anymore,
> because:
>
>   - The OOM killer issue has already been addressed by checking
>     RLIMIT_NOFILE.
>
>   - The soft lockup issue is no longer relevant, because the garbage
>     collector now runs asynchronously since commit d9f21b361333 ("af_unix:
>     Try to run GC async.")

I don't think the latter is resolved.  Without blocking insane
users, they can keep pushing sockets to the kernel work,
which could be soft-lockup'd.


>
> Therefore, remove that to prevent priority inversion. Running all the
> reproducers from the mentioned bug reports after this patch, no problem is
> observed.
>
> Signed-off-by: Nam Cao <namcao@linutronix.de>
> ---
>  net/unix/garbage.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/net/unix/garbage.c b/net/unix/garbage.c
> index 0783555e2526..f180c59b3da9 100644
> --- a/net/unix/garbage.c
> +++ b/net/unix/garbage.c
> @@ -300,8 +300,6 @@ int unix_prepare_fpl(struct scm_fp_list *fpl)
>         if (!fpl->edges)
>                 goto err;
>
> -       unix_schedule_gc(fpl->user);
> -
>         return 0;
>
>  err:
> --
> 2.47.3
>

* Re: [PATCH 1/2] af_unix: Do not wait for garbage collector in sendmsg()
  2026-07-02  3:27   ` Kuniyuki Iwashima
@ 2026-07-02  3:56     ` Nam Cao
  0 siblings, 0 replies; 9+ messages in thread
From: Nam Cao @ 2026-07-02  3:56 UTC
  To: Kuniyuki Iwashima
  Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, netdev, linux-kernel, linux-rt-devel

Kuniyuki Iwashima <kuniyu@google.com> writes:
> On Wed, Jul 1, 2026 at 9:36 AM Nam Cao <namcao@linutronix.de> wrote:
>> Now, sendmsg() does not have to block on the garbage collector anymore,
>> because:
>>
>>   - The OOM killer issue has already been addressed by checking
>>     RLIMIT_NOFILE.
>>
>>   - The soft lockup issue is no longer relevant, because the garbage
>>     collector now runs asynchronously since commit d9f21b361333 ("af_unix:
>>     Try to run GC async.")
>
> I don't think the latter is resolved.  Without blocking insane
> users, they can keep pushing sockets to the kernel work,
> which could be soft-lockup'd.

A user cannot push more than RLIMIT_NOFILE sockets before the GC runs. And
the GC grabs the spin lock, cleans up whatever is present, and exits.

So a user could make the GC run again and again, but there would be no
soft lockup, as the GC yields after short intervals.

Nam

* Re: [PATCH 1/2] af_unix: Do not wait for garbage collector in sendmsg()
  2026-07-01 16:35 ` [PATCH 1/2] af_unix: Do not wait for garbage collector in sendmsg() Nam Cao
  2026-07-02  3:27   ` Kuniyuki Iwashima
@ 2026-07-02 16:36   ` sashiko-bot
  2026-07-03  5:53     ` Nam Cao
  1 sibling, 1 reply; 9+ messages in thread
From: sashiko-bot @ 2026-07-02 16:36 UTC
  To: Nam Cao; +Cc: linux-rt-devel, linux-kernel

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [High] Garbage collector is no longer scheduled when creating new inflight FDs, leading to permanent cycle accumulation and -ETOOMANYREFS failures in isolated environments.
--

commit 506189bd29f8ad56f5d9e51f3e0216203ebad5ec
Author: Nam Cao <namcao@linutronix.de>

af_unix: Do not wait for garbage collector in sendmsg()

This patch removes the garbage collector scheduling and waiting from the
sendmsg() path for AF_UNIX sockets. It aims to prevent priority inversion
by relying on RLIMIT_NOFILE limits and asynchronous garbage collection.

> diff --git a/net/unix/garbage.c b/net/unix/garbage.c
> index 0783555e25266..f180c59b3da9d 100644
> --- a/net/unix/garbage.c
> +++ b/net/unix/garbage.c
> @@ -300,8 +300,6 @@ int unix_prepare_fpl(struct scm_fp_list *fpl)
>  	if (!fpl->edges)
>  		goto err;
>  
> -	unix_schedule_gc(fpl->user);
> -

[Severity: High]
Does this removal prevent the garbage collector from running when a process
creates isolated socket cycles?

If a process continuously creates socket cycles (e.g., passing FDs to each
other) and closes its local FDs, the sockets reference each other in their
receive queues, so their file reference counts will not reach zero.

Because the refcounts do not drop to zero, unix_release_sock() is never
triggered. By removing unix_schedule_gc() from unix_prepare_fpl() here,
unix_release_sock() is left as the only trigger for garbage collection.

This could allow the inflight count to accumulate indefinitely until it
hits RLIMIT_NOFILE, causing subsequent sendmsg() calls to fail with
-ETOOMANYREFS and permanently leaking the network namespace unless an
unrelated AF_UNIX socket is closed system-wide.

>  	return 0;
>  
>  err:

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/cover.1782922354.git.namcao@linutronix.de?part=1

* Re: [PATCH 1/2] af_unix: Do not wait for garbage collector in sendmsg()
  2026-07-02 16:36   ` sashiko-bot
@ 2026-07-03  5:53     ` Nam Cao
  2026-07-03  6:25       ` Kuniyuki Iwashima
  0 siblings, 1 reply; 9+ messages in thread
From: Nam Cao @ 2026-07-03  5:53 UTC
  To: sashiko-reviews, Kuniyuki Iwashima, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, netdev,
	linux-kernel, linux-rt-devel
  Cc: linux-rt-devel, linux-kernel

sashiko-bot@kernel.org writes:
> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
> - [High] Garbage collector is no longer scheduled when creating new inflight FDs, leading to permanent cycle accumulation and -ETOOMANYREFS failures in isolated environments.
> --
>
> commit 506189bd29f8ad56f5d9e51f3e0216203ebad5ec
> Author: Nam Cao <namcao@linutronix.de>
>
> af_unix: Do not wait for garbage collector in sendmsg()
>
> This patch removes the garbage collector scheduling and waiting from the
> sendmsg() path for AF_UNIX sockets. It aims to prevent priority inversion
> by relying on RLIMIT_NOFILE limits and asynchronous garbage collection.
>
>> diff --git a/net/unix/garbage.c b/net/unix/garbage.c
>> index 0783555e25266..f180c59b3da9d 100644
>> --- a/net/unix/garbage.c
>> +++ b/net/unix/garbage.c
>> @@ -300,8 +300,6 @@ int unix_prepare_fpl(struct scm_fp_list *fpl)
>>  	if (!fpl->edges)
>>  		goto err;
>>  
>> -	unix_schedule_gc(fpl->user);
>> -
>
> [Severity: High]
> Does this removal prevent the garbage collector from running when a process
> creates isolated socket cycles?
>
> If a process continuously creates socket cycles (e.g., passing FDs to each
> other) and closes its local FDs, the sockets reference each other in their
> receive queues, so their file reference counts will not reach zero.
>
> Because the refcounts do not drop to zero, unix_release_sock() is never
> triggered. By removing unix_schedule_gc() from unix_prepare_fpl() here,
> unix_release_sock() is left as the only trigger for garbage collection.
>
> This could allow the inflight count to accumulate indefinitely until it
> hits RLIMIT_NOFILE, causing subsequent sendmsg() calls to fail with
> -ETOOMANYREFS and permanently leaking the network namespace unless an
> unrelated AF_UNIX socket is closed system-wide.

Sashiko found a valid issue here. But this is a pre-existing issue. It
is not introduced in this patch.

unix_schedule_gc() does not actually schedule the garbage collector
unless the user's inflight AF_UNIX socket count reaches
UNIX_INFLIGHT_SANE_USER (2024). So a user can already accumulate up to 2023
inflight sockets without the garbage collector running until
unix_release_sock() is called.
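
For reference, the arithmetic behind those two numbers can be sketched in
user space. This is only an illustration, not kernel code: SCM_MAX_FD is
assumed to be 253 as in include/net/scm.h, and sendmsg_skips_gc() is a
made-up helper name that mirrors the "inflight < UNIX_INFLIGHT_SANE_USER"
early return taken on the sendmsg() path.

```c
#include <assert.h>	/* static_assert */
#include <stdbool.h>

/* Assumption: SCM_MAX_FD is 253, as in include/net/scm.h. */
#define SCM_MAX_FD		253
#define UNIX_INFLIGHT_SANE_USER	(SCM_MAX_FD * 8)

/* 253 * 8 = 2024, the threshold quoted above. */
static_assert(UNIX_INFLIGHT_SANE_USER == 2024, "253 * 8 == 2024");

/*
 * Hypothetical helper, not a kernel function. It models the check
 *
 *	if (user && READ_ONCE(user->unix_inflight) < UNIX_INFLIGHT_SANE_USER)
 *		return;
 *
 * for a sendmsg() caller (user != NULL): true means the GC is left alone.
 */
bool sendmsg_skips_gc(unsigned long inflight)
{
	return inflight < UNIX_INFLIGHT_SANE_USER;
}
```

With 2023 inflight sockets the helper returns true (no GC is scheduled from
sendmsg()); with 2024 it returns false.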

Nam

* Re: [PATCH 1/2] af_unix: Do not wait for garbage collector in sendmsg()
  2026-07-03  5:53     ` Nam Cao
@ 2026-07-03  6:25       ` Kuniyuki Iwashima
  2026-07-04  6:03         ` Nam Cao
  0 siblings, 1 reply; 9+ messages in thread
From: Kuniyuki Iwashima @ 2026-07-03  6:25 UTC
  To: Nam Cao
  Cc: sashiko-reviews, David S . Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, netdev, linux-kernel, linux-rt-devel

On Thu, Jul 2, 2026 at 10:53 PM Nam Cao <namcao@linutronix.de> wrote:
>
> sashiko-bot@kernel.org writes:
> > Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
> > - [High] Garbage collector is no longer scheduled when creating new inflight FDs, leading to permanent cycle accumulation and -ETOOMANYREFS failures in isolated environments.
> > --
> >
> > commit 506189bd29f8ad56f5d9e51f3e0216203ebad5ec
> > Author: Nam Cao <namcao@linutronix.de>
> >
> > af_unix: Do not wait for garbage collector in sendmsg()
> >
> > This patch removes the garbage collector scheduling and waiting from the
> > sendmsg() path for AF_UNIX sockets. It aims to prevent priority inversion
> > by relying on RLIMIT_NOFILE limits and asynchronous garbage collection.
> >
> >> diff --git a/net/unix/garbage.c b/net/unix/garbage.c
> >> index 0783555e25266..f180c59b3da9d 100644
> >> --- a/net/unix/garbage.c
> >> +++ b/net/unix/garbage.c
> >> @@ -300,8 +300,6 @@ int unix_prepare_fpl(struct scm_fp_list *fpl)
> >>      if (!fpl->edges)
> >>              goto err;
> >>
> >> -    unix_schedule_gc(fpl->user);
> >> -
> >
> > [Severity: High]
> > Does this removal prevent the garbage collector from running when a process
> > creates isolated socket cycles?
> >
> > If a process continuously creates socket cycles (e.g., passing FDs to each
> > other) and closes its local FDs, the sockets reference each other in their
> > receive queues, so their file reference counts will not reach zero.
> >
> > Because the refcounts do not drop to zero, unix_release_sock() is never
> > triggered. By removing unix_schedule_gc() from unix_prepare_fpl() here,
> > unix_release_sock() is left as the only trigger for garbage collection.
> >
> > This could allow the inflight count to accumulate indefinitely until it
> > hits RLIMIT_NOFILE, causing subsequent sendmsg() calls to fail with
> > -ETOOMANYREFS and permanently leaking the network namespace unless an
> > unrelated AF_UNIX socket is closed system-wide.
>
> Sashiko found a valid issue here.

This is the same as my point, and

> But this is a pre-existing issue. It
> is not introduced in this patch.

your patch makes it much easier to abuse.
UNIX_INFLIGHT_SANE_USER is usually much smaller than
RLIMIT_NOFILE.

unix_schedule_gc() in sendmsg() is there to make malicious users
self-regulate; otherwise GC relies on an unrelated AF_UNIX socket's close()
and could be triggered too late, since GC is system-wide.

Previously every sendmsg() had to wait for GC, and now it's only when
there is a circular reference AND user has too many inflight sockets.
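
The conditions above can be sketched as a small user-space model of the
pre-patch unix_schedule_gc(user), based on the lines removed in patch 2/2.
This is an illustration only: struct gc_decision, model_schedule_gc() and
its boolean parameters are made-up names standing in for the kernel state
named in the comments, and the model ignores READ_ONCE() and locking.

```c
#include <assert.h>
#include <stdbool.h>

/* What the caller ends up doing; the field names are illustrative. */
struct gc_decision {
	bool queue;	/* queue_work(system_dfl_wq, &unix_gc_work) */
	bool wait;	/* flush_work(&unix_gc_work) */
};

/*
 * from_sendmsg:   user != NULL (sendmsg() path); false models the
 *                 unix_schedule_gc(NULL) call in unix_release_sock()
 * over_threshold: user->unix_inflight >= UNIX_INFLIGHT_SANE_USER
 * maybe_cyclic:   unix_graph_state != UNIX_GRAPH_NOT_CYCLIC
 * gc_in_progress: the gc_in_progress flag
 * cyclic_sccs:    unix_graph_cyclic_sccs != 0
 */
struct gc_decision model_schedule_gc(bool from_sendmsg, bool over_threshold,
				     bool maybe_cyclic, bool gc_in_progress,
				     bool cyclic_sccs)
{
	struct gc_decision d = { false, false };

	/* No possible cycle: nothing for the GC to do. */
	if (!maybe_cyclic)
		return d;

	/* "Sane" senders return without touching the GC. */
	if (from_sendmsg && !over_threshold)
		return d;

	if (!gc_in_progress)
		d.queue = true;

	/* Only sendmsg() callers block, and only while cyclic SCCs exist. */
	if (from_sendmsg && cyclic_sccs)
		d.wait = true;

	/* Invariant of the model: the close() path never waits. */
	assert(!d.wait || from_sendmsg);
	return d;
}
```

In this model, a sender only waits when it is over the threshold and cyclic
SCCs exist; the close() path (from_sendmsg == false) may queue the work but
never waits, which is the only behavior left after both patches.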

Please fix the root cause; the former condition on your system.
--
pw-bot: cr

* Re: [PATCH 1/2] af_unix: Do not wait for garbage collector in sendmsg()
  2026-07-03  6:25       ` Kuniyuki Iwashima
@ 2026-07-04  6:03         ` Nam Cao
  0 siblings, 0 replies; 9+ messages in thread
From: Nam Cao @ 2026-07-04  6:03 UTC
  To: Kuniyuki Iwashima
  Cc: sashiko-reviews, David S . Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, netdev, linux-kernel, linux-rt-devel

Kuniyuki Iwashima <kuniyu@google.com> writes:
> your patch makes it much easier to abuse.
> UNIX_INFLIGHT_SANE_USER is usually much smaller than
> RLIMIT_NOFILE.
>
> unix_schedule_gc() in sendmsg() is to self-regulate malicious users,
> otherwise GC relies on unrelated AF_UNIX socket's close() and could
> be triggered too late since GC is system-wide.

About the abuse: the scenario where inflight sockets exceed
UNIX_INFLIGHT_SANE_USER and GC is delayed until an unrelated AF_UNIX socket
is closed already exists today.

For example, the following program creates far more than
UNIX_INFLIGHT_SANE_USER inflight sockets, which persist indefinitely
until an unrelated AF_UNIX socket is closed.

#include <sys/mount.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <sys/wait.h>

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static int send_fd(int unix_fd, int fd)
{
        struct msghdr msgh;
        struct cmsghdr *cmsg;
        char buf[CMSG_SPACE(sizeof(fd))];

        memset(&msgh, 0, sizeof(msgh));

        memset(buf, 0, sizeof(buf));
        msgh.msg_control = buf;
        msgh.msg_controllen = sizeof(buf);

        cmsg = CMSG_FIRSTHDR(&msgh);
        cmsg->cmsg_len = CMSG_LEN(sizeof(fd));
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;

        msgh.msg_controllen = cmsg->cmsg_len;

        memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));
        return sendmsg(unix_fd, &msgh, 0);
}

int main(int argc, char *argv[])
{
	int fd[2];
	int i;

	for (int n = 0; n < 100; ++n) {
		if (socketpair(PF_UNIX, SOCK_SEQPACKET, 0, fd) == -1)
			goto out_error;

		/*
		 * Pass each socket over itself: the fd lands in the peer's
		 * receive queue, so the two sockets end up holding
		 * references to each other.
		 */
		for (i = 0; i < 100; ++i) {
			if (send_fd(fd[0], fd[0]) == -1)
				goto out_error;

			if (send_fd(fd[1], fd[1]) == -1)
				goto out_error;
		}
	}

	return 0;

out_error:
	fprintf(stderr, "error: %s\n", strerror(errno));
	return 1;
}

To address this properly, we can schedule the GC at task exit. I can
include that patch in my series, if that sounds good to you.

> Previously every sendmsg() had to wait for GC, and now it's only when
> there is a circular reference AND user has too many inflight sockets.
>
> Please fix the root cause; the former condition on your system.

Can you clarify what you mean by fixing the former condition on my
system? Do you mean ensuring that no application creates a circular
reference?

I am afraid we cannot rely on all users and applications to behave. We
do not want a buggy or malicious program to harm a time-critical task.

Nam
