Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread


* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
       [not found] <F1E9B1BEBD8867CA+092eea18-94c7-4c65-a466-95cd3628a88c@uniontech.com>
@ 2026-02-21  7:11 ` syzbot
  2026-02-21  7:28   ` Ding Yihan
  0 siblings, 1 reply; 27+ messages in thread
From: syzbot @ 2026-02-21  7:11 UTC
  To: dingyihan, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+7ea2f5e9dfd468201817@syzkaller.appspotmail.com
Tested-by: syzbot+7ea2f5e9dfd468201817@syzkaller.appspotmail.com

Tested on:

commit:         d4906ae1 Add linux-next specific files for 20260220
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=13ea89e6580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=51f859f3211496bc
dashboard link: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
patch:          https://syzkaller.appspot.com/x/patch.diff?x=15f0595a580000

Note: testing is done by a robot and is best-effort only.


* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-21  7:11 ` [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback syzbot
@ 2026-02-21  7:28   ` Ding Yihan
  2026-02-21 12:00     ` Günther Noack
  2026-02-24 14:43     ` [syzbot] [kernel?] INFO: task hung " Günther Noack
  0 siblings, 2 replies; 27+ messages in thread
From: Ding Yihan @ 2026-02-21  7:28 UTC
  To: syzbot, Mickaël Salaün
  Cc: linux-security-module, Günther Noack

Hi all,

Thanks to syzbot for the testing and confirmation.

Since I am relatively new to the inner workings of this specific subsystem, 
I would like to take a few days to thoroughly study the root cause 
(the task_work and mutex interaction) and prepare a detailed and proper commit message. 

I will send out the formal patch (v1) to the mailing list later.

Best regards,
Yihan Ding





* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-21  7:28   ` Ding Yihan
@ 2026-02-21 12:00     ` Günther Noack
  2026-02-21 13:19       ` Günther Noack
  2026-02-24 14:43     ` [syzbot] [kernel?] INFO: task hung " Günther Noack
  1 sibling, 1 reply; 27+ messages in thread
From: Günther Noack @ 2026-02-21 12:00 UTC
  To: Ding Yihan
  Cc: syzbot, Mickaël Salaün, linux-security-module,
	Jann Horn

Hello Ding!

On Sat, Feb 21, 2026 at 03:28:47PM +0800, Ding Yihan wrote:
> Since I am relatively new to the inner workings of this specific subsystem, 
> I would like to take a few days to thoroughly study the root cause 
> (the task_work and mutex interaction) and prepare a detailed and proper commit message. 
> 
> I will send out the formal patch (v1) to the mailing list later.

Thank you very much for preparing a patch, and especially also for
forwarding this to us.  (The original syzkaller report was somehow not
addressed to Landlock or the LSM list.  We should fix that.)

Timing-wise, the feature was picked up for the 7.0 release, so we
still have some time to fix it before this is stable.

As an early review for the patch:

Background:

We had previously convinced ourselves that grabbing the
cred_guard_mutex was not necessary.  To quote the comment in
landlock_restrict_sibling_threads():

    Unlike seccomp, which modifies sibling tasks directly, we do not need to
    acquire the cred_guard_mutex and sighand->siglock:

    - As in our case, all threads are themselves exchanging their own struct
      cred through the credentials API, no locks are needed for that.
    - Our for_each_thread() loops are protected by RCU.
    - We do not acquire a lock to keep the list of sibling threads stable
      between our for_each_thread loops.  If the list of available sibling
      threads changes between these for_each_thread loops, we make up for
      that by continuing to look for threads until they are all discovered
      and have entered their task_work, where they are unable to spawn new
      threads.

The question of locking cred_guard_mutex came up in the patch
discussion multiple times as well; the most recent discussion was:
https://lore.kernel.org/all/20251020.fohbo6Iecahz@digikod.net/

If it helps, I keep some of my own notes for this particular feature
on https://wiki.gnoack.org/LandlockMultithreadedEnforcement.

(Very) tentative investigation:

In the Syzkaller report [2], it seems that the reproducer [2.1] is
creating two rulesets and then enforcing them in parallel, a scenario
which we are exercising in the TEST(competing_enablement) in
tools/testing/selftests/landlock/tsync_test.c already, but which has
not failed in my own selftest runs.

In the crash report, there are four threads in total:

* Two are stuck in the line
  wait_for_completion(&ctx->ready_to_commit);
  in the per-thread task work (line 128 [4.1])
* Two are stuck in the line
  wait_for_completion(&shared_ctx.all_prepared)
  in the calling thread's coordination logic (line 539 [4.2])

On line 539, we are already on the code path where we have detected
that we are being interrupted by another thread and where we are
attempting to deal with the scenario where two
landlock_restrict_self() calls compete.  This is detected on line 523,
when wait_for_completion_interruptible() returns a nonzero value.  The
approach to handle this is to set the overall -ERESTARTNOINTR error
and cancel the work that has been ongoing so far, by canceling the
task works that did not start running yet and waiting for the ones
that did start running (that is the step where we are blocked!).  The
reasoning there was that these task works will all hit the
"all_prepared" stage now, but as we can see in the stack trace, the
task works that are actively running are already on line 128 and have
passed the "all_prepared" stage.

Differences I can see between syzkaller and our own test:

* The reproducer also calls openat() and then socketpair() twice.
  These syscalls should be unrelated, but it's possible that the
  "async" invocation of socketpair() contributes to adding more
  threads. (Assuming that "async" means "in new thread" in syzkaller)
* Syzkaller gives it more attempts. ([2.2])

I do not understand yet what went wrong in our scheme and need to look
deeper.

Ding, do you have more insights into it from your debugging?

Thanks,
–Günther

For reference:

[1] Report Mail: https://lore.kernel.org/all/69984159.050a0220.21cd75.01bb.GAE@google.com/
[2] Report: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
  [2.1] Reproducer: https://syzkaller.appspot.com/text?tag=ReproSyz&x=16e41c02580000
  [2.2] Reproducer (C): https://syzkaller.appspot.com/text?tag=ReproC&x=15813652580000
[3] Patch: https://lore.kernel.org/all/6999504d.a70a0220.2c38d7.0154.GAE@google.com/
[4.1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/security/landlock/tsync.c?id=635c467cc14ebdffab3f77610217c1dacaf88e8c#n128
[4.2] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/security/landlock/tsync.c?id=635c467cc14ebdffab3f77610217c1dacaf88e8c#n539


* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-21 12:00     ` Günther Noack
@ 2026-02-21 13:19       ` Günther Noack
  2026-02-23  9:42         ` Günther Noack
  0 siblings, 1 reply; 27+ messages in thread
From: Günther Noack @ 2026-02-21 13:19 UTC
  To: Ding Yihan
  Cc: syzbot, Mickaël Salaün, linux-security-module,
	Jann Horn

On Sat, Feb 21, 2026 at 01:00:03PM +0100, Günther Noack wrote:
> (Very) tentative investigation:
> 
> In the Syzkaller report [2], it seems that the reproducer [2.1] is
> creating two rulesets and then enforcing them in parallel, a scenario
> which we are exercising in the TEST(competing_enablement) in
> tools/testing/selftests/landlock/tsync_test.c already, but which has
> not failed in my own selftest runs.
> 
> In the crash report, there are four threads in total:
> 
> * Two are stuck in the line
>   wait_for_completion(&ctx->ready_to_commit);
>   in the per-thread task work (line 128 [4.1])
> * Two are stuck in the line
>   wait_for_completion(&shared_ctx.all_prepared)
>   in the calling thread's coordination logic (line 539 [4.2])
> 
> In line 539, we are already on the code path where we detected that we
> are getting interrupted by another thread and where we are attempting
> to deal with the scenario where two landlock_restrict_self() calls
> compete.  This is detected on line 523 when
> wait_for_completion_interruptible() is true.  The approach to handle
> this is to set the overall -ERESTARTNOINTR error and cancel the work
> that has been ongoing so far, by canceling the task works that did not
> start running yet and waiting for the ones that did start running
> (that is the step where we are blocked!).  The reasoning there was
> that these task works will all hit the "all_prepared" stage now, but
> as we can see in the stack trace, the task works that are actively
> running are already on line 128 and have passed the "all_prepared"
> stage).
> 
> Differences I can see between syzkaller and our own test:
> 
> * The reproducer also calls openat() and then twice socketpair().
>   These syscalls should be unrelated, but it's possible that the
>   "async" invocation of socketpair() contributes to adding more
>   threads. (Assuming that "async" means "in new thread" in syzkaller)
> * Syzkaller gives it more attempts. ([2.2])
> 
> I do not understand yet what went wrong in our scheme and need to look
> deeper.

OK, I think I understand now.  Our existing recovery code for this
conflict is this:

/*
 * Decrement num_preparing for current, to undo that we initialized it
 * to 1 a few lines above.
 */
if (atomic_dec_return(&shared_ctx.num_preparing) > 0) {
	if (wait_for_completion_interruptible(
		    &shared_ctx.all_prepared)) {
		/* In case of interruption, we need to retry the system call. */
		atomic_set(&shared_ctx.preparation_error,
			   -ERESTARTNOINTR);

		/*
		 * Cancel task works for tasks that did not start running yet,
		 * and decrement all_prepared and num_unfinished accordingly.
		 */
		cancel_tsync_works(&works, &shared_ctx);

		/*
		 * The remaining task works have started running, so waiting for
		 * their completion will finish.
		 */
		wait_for_completion(&shared_ctx.all_prepared);
	}
}

When I wrote this, I assumed, as the last comment states, that the
task works which we could not cancel are already running.

I was wrong there, because I had misunderstood task_work_run().  When
the task works get run there, it first *atomically dequeues the entire
queue of scheduled task works*, and then runs them sequentially.

That is why, if we have one task work that belongs to the first
landlock_restrict_self() call and one that belongs to the other, the
task work which is scheduled later (a) can no longer be dequeued with
cancel_tsync_works(), and (b) has not started running yet either.

Now the only thing that is necessary to produce the deadlock is that
we have a pair of threads where the task works for the restriction
calls have been scheduled in a different order.  When the two
landlock_restrict_self() calls end up in the recovery path quoted
above, each of them will wait for one of its task works to run, which
is blocked from running by another task work that is scheduled before
it and does not finish either.
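
A minimal userspace sketch of this effect, assuming a much simplified
queue model: the names below (struct work, queue_add(), queue_cancel(),
queue_take_all()) are hypothetical and only mimic the "detach
everything, then run sequentially" behavior described above; this is
not the kernel's task_work code.  Once the runner has detached the
batch, a later cancel attempt no longer finds the second work, even
though that work has not started running.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one queued work item (not a kernel structure). */
struct work {
	struct work *next;
	int owner;	/* which landlock_restrict_self() call queued it */
	bool started;	/* set by the runner when it begins executing it */
};

/* Pending works of one thread, in the order in which they will run. */
struct queue {
	struct work *head;
	struct work **tail;
};

static void queue_init(struct queue *q)
{
	q->head = NULL;
	q->tail = &q->head;
}

static void queue_add(struct queue *q, struct work *w)
{
	assert(w != NULL);
	w->next = NULL;
	w->started = false;
	*q->tail = w;
	q->tail = &w->next;
}

/* Cancelling succeeds only while the work is still on the pending list. */
static bool queue_cancel(struct queue *q, struct work *w)
{
	for (struct work **pp = &q->head; *pp; pp = &(*pp)->next) {
		if (*pp == w) {
			*pp = w->next;
			if (q->tail == &w->next)
				q->tail = pp;
			return true;
		}
	}
	return false;
}

/* The runner detaches the whole pending list in one step. */
static struct work *queue_take_all(struct queue *q)
{
	struct work *batch = q->head;

	queue_init(q);
	return batch;
}
```

With works a and b queued in that order, calling queue_take_all() and
then queue_cancel(&q, &b) returns false while b.started is still
false, which is the state the recovery path did not account for.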

(Just pasting a brain dump here to save you some time hunting for the
root cause. I don't know the best solution yet either.)

–Günther


* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-21 13:19       ` Günther Noack
@ 2026-02-23  9:42         ` Günther Noack
  2026-02-23 11:29           ` Ding Yihan
  2026-02-24  6:27           ` [PATCH] landlock: Fix deadlock " Yihan Ding
  0 siblings, 2 replies; 27+ messages in thread
From: Günther Noack @ 2026-02-23  9:42 UTC
  To: Ding Yihan
  Cc: syzbot, Mickaël Salaün, linux-security-module,
	Jann Horn, Paul Moore

On Sat, Feb 21, 2026 at 02:19:53PM +0100, Günther Noack wrote:
> OK, I think I understand now.  Our existing recovery code for this
> conflict is this:
> 
> /*
>  * Decrement num_preparing for current, to undo that we initialized it
>  * to 1 a few lines above.
>  */
> if (atomic_dec_return(&shared_ctx.num_preparing) > 0) {
> 	if (wait_for_completion_interruptible(
> 		    &shared_ctx.all_prepared)) {
> 		/* In case of interruption, we need to retry the system call. */
> 		atomic_set(&shared_ctx.preparation_error,
> 			   -ERESTARTNOINTR);
> 
> 		/*
> 		 * Cancel task works for tasks that did not start running yet,
> 		 * and decrement all_prepared and num_unfinished accordingly.
> 		 */
> 		cancel_tsync_works(&works, &shared_ctx);
> 
> 		/*
> 		 * The remaining task works have started running, so waiting for
> 		 * their completion will finish.
> 		 */
> 		wait_for_completion(&shared_ctx.all_prepared);
> 	}
> }
> 
> When I wrote this, I assumed, as the last comment states, that the
> task works which we could not cancel, are already running.
> 
> I was wrong there, because I had misunderstood task_work_run().  When
> the task works get run there, it first *atomically dequeues the entire
> queue of scheduled task works*, and then runs them sequentially.
> 
> That is why, if we have one task work that belongs to the first
> landlock_restrict_self() call and one which belongs to the other, the
> task work which is scheduled later can (a) not be dequeued with
> cancel_tsync_works() any more, and (b) also has not started running
> yet.
> 
> Now the only thing that is necessary to produce the deadlock is that
> we have a pair of threads where the task works for the restriction
> calls have been scheduled in different order.  When the two
> landlock_restrict_self() calls end up in the recovery path quoted
> above, they will wait for one of their task works to run which is
> blocked from running by another task work that is scheduled before and
> does not finish either.
> 
> (Just pasting a brain dump here to save you some time hunting for the
> root cause. I don't know the best solution yet either.)

Let me propose the following fixes:

1. Immediate fix for that specific issue
----------------------------------------

Proposal:
* Remove the wait_for_completion(&shared_ctx.all_prepared)
  call in the code snippet above.
* Rewrite surrounding comments: Be clear about the fact that
  cancel_tsync_works() is an opportunistic improvement, but we don't
  have a guarantee at all that it cancels any of the enqueued task
  works (because task_work_run might already have popped them off).

This removes the hold-and-wait dependency cycle between the threads,
which produces the observed deadlock.  The way that we shut down now
is that we exit the main loop (this already happens without an
explicit "break", but we might add one for clarity).

I think that this fix or an equivalent one is needed here, because
either way, our assumptions in the quoted code above were wrong.

2. Can we reason constructively about correctness?
--------------------------------------------------

The remaining question: If, on the shutdown path, we cannot actually
remove all the enqueued task works, under what circumstances are we
even able to interrupt and return from the landlock_restrict_self()
system call?

2.1 For n competing restrict_self calls, n-1 of them need to get interrupted
----------------------------------------------------------------------------

To answer this, consider a multithreaded process with threads named
"red", "green" and "blue" and many additional threads: When "red",
"green" and "blue" enforce landlock_restrict_self() concurrently, due
to differing iteration order, we might end up enqueueing the task
works on other threads in all of the following combinations:

  t0:  R G B  <- front of queue
  t1:  R B G
  t2:  G R B
  t3:  G B R
  t4:  B R G
  t5:  B G R

In this configuration, for any of the landlock_restrict_self() system
calls to even return (successfully or unsuccessfully), at least two
threads must receive an interrupt and therefore remove their enqueued
task works from the front of the queue.  Assuming those are green and
blue, we get:

  t0:  R      <- front of queue
  t1:  R
  t2:  G R
  t3:  G B R
  t4:  B R
  t5:  B G R

(This works because after the patch above, all of the enqueued G and B
works finish even if there are remaining G and B works that are still
blocked by an "R" entry.)

Now, "R" is in the front of the queue, and the
landlock_restrict_self() call for the red thread can finish normally,
even without it being interrupted.

Once the "R" task works are done as well, the remaining G and B works
can run and finish as well.

This scheme generalizes: If we have n competing
landlock_restrict_self() calls, then in the worst case, at least n-1
of these system calls need to be interrupted so that they can all
terminate.
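
The n-1 argument can be checked with a small userspace simulation.
This is a sketch under simplifying assumptions made for this example,
not kernel behavior verified in detail: a task work of an interrupted
call finishes as soon as it reaches the front of its queue, while the
task works of an uninterrupted call can only finish once all of them
are at the front of their queues at the same time.  The function name
all_calls_terminate() and the CALL_* IDs are hypothetical, and each
call is assumed to appear at most once per queue.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_QUEUES 16
#define MAX_CALLS 8

/* Hypothetical IDs for the competing landlock_restrict_self() calls. */
enum { CALL_R, CALL_G, CALL_B };

static bool is_interrupted(unsigned int mask, int call)
{
	return (mask >> call) & 1u;
}

/*
 * queues: nq rows of len call IDs each, front of the queue first.
 * interrupted: bitmask of call IDs whose system call got interrupted.
 * Returns true if every queue drains, i.e. all calls can return.
 */
static bool all_calls_terminate(int nq, int len, const int *queues,
				unsigned int interrupted)
{
	int head[MAX_QUEUES] = { 0 };
	bool progress = true;

	assert(nq <= MAX_QUEUES);

	while (progress) {
		progress = false;

		/* Works of interrupted calls finish as soon as they run. */
		for (int q = 0; q < nq; q++) {
			while (head[q] < len &&
			       is_interrupted(interrupted,
					      queues[q * len + head[q]])) {
				head[q]++;
				progress = true;
			}
		}

		/* Other calls need all remaining works in front at once. */
		for (int c = 0; c < MAX_CALLS; c++) {
			int remaining = 0, at_front = 0;

			if (is_interrupted(interrupted, c))
				continue;
			for (int q = 0; q < nq; q++) {
				for (int i = head[q]; i < len; i++)
					remaining += (queues[q * len + i] == c);
				at_front += (head[q] < len &&
					     queues[q * len + head[q]] == c);
			}
			if (remaining == 0 || remaining != at_front)
				continue;
			for (int q = 0; q < nq; q++)
				if (head[q] < len &&
				    queues[q * len + head[q]] == c)
					head[q]++;
			progress = true;
		}
	}

	for (int q = 0; q < nq; q++)
		if (head[q] < len)
			return false;
	return true;
}
```

For the six queues t0..t5 listed above, this model returns true when G
and B are interrupted, and false when only G is interrupted, because
the R and B works then block each other.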

2.2 Can we guarantee that two system calls get interrupted?
-----------------------------------------------------------

In case of competing landlock_restrict_self() calls, I think it is
possible that not all relevant system calls get seen by their
competitors, and therefore interrupted.  The scenario is
one where we have a "red" and "green" thread calling
landlock_restrict_self().

  (a set of additional threads)
  t0: task_works: R G
  t1: task_works: G R
  tR: red thread
  tG: green thread

In the red thread, the following happens:
 * Under RCU, count the number of total threads => get a low number
 * Allocate space for that number of task_works
 * Under RCU
   * Enqueue "R" into t0 and t1
   * Enqueue "R" for some of the "additional threads"
   * But we do not have enough pre-allocated space to enqueue "R" for
     the green thread tG.

The same thing happens in the green thread as well.

The result is that we still have a deadlock between t0 and t1, but
neither the red nor the green thread gets interrupted so that it can
resolve it.

(FWIW, you could resolve it from the outside by sending a signal to
the red or green thread manually, but it is not guaranteed to happen
on its own.)

Caveat: I am making pessimistic assumptions about the iteration order
of the task list here, and I am assuming that the number of
"additional threads" is swinging up and down during the competing
enforcement, so that the enforcing threads are mis-approximating the
required space for memory pre-allocation.

2.3 Possible resolutions
------------------------

* We could try to interrupt all sibling threads during the teardown,
  to fix the issue discussed in 2.2. (Downside: Complicated, more
  expensive)
* The reason why landlock_restrict_self() can't return is that it
  needs to wait until all task works are done before it can free the
  memory.  Alternatively, we could make the task works take ownership
  of these memory structures (refcounting the shared_ctx).  (Downside:
  The used memory is no longer linear in the number of threads.)
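
The refcounting idea from the second bullet can be sketched in
userspace C as follows.  This is only an illustration under
assumptions: struct shared_ctx_model, ctx_alloc(), ctx_get() and
ctx_put() are hypothetical names, the struct does not reflect the
actual layout of shared_ctx, and C11 atomics stand in for whatever
reference counting primitive kernel code would use.  The point is that
whoever drops the last reference frees the memory, so the calling
thread would not have to wait for every task work before returning.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the context shared with the task works. */
struct shared_ctx_model {
	atomic_int refs;
	int preparation_error;
};

/* Allocates a context holding one reference, owned by the caller. */
static struct shared_ctx_model *ctx_alloc(void)
{
	struct shared_ctx_model *ctx = calloc(1, sizeof(*ctx));

	if (ctx)
		atomic_init(&ctx->refs, 1);
	return ctx;
}

/* Takes an extra reference, e.g. on behalf of one queued task work. */
static void ctx_get(struct shared_ctx_model *ctx)
{
	assert(ctx != NULL);
	atomic_fetch_add(&ctx->refs, 1);
}

/* Drops one reference; returns true if this call freed the context. */
static bool ctx_put(struct shared_ctx_model *ctx)
{
	assert(ctx != NULL);
	if (atomic_fetch_sub(&ctx->refs, 1) == 1) {
		free(ctx);
		return true;
	}
	return false;
}
```

In this model, the caller can drop its own reference and return while
two works still hold theirs; the context is freed by whichever work
calls ctx_put() last.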

Side remark: In testing, I had the impression that the
landlock_restrict_self() calls can go into a retry loop for a while
where all competing threads get interrupted all the time; in a debug
build, when the Syzkaller test prints out a line for each attempt,
sometimes it was hanging for seconds and *then* resolving itself
again.

3. Conclusion
-------------

I would prefer it if the final solution did not require deadlock
reasoning at that level and we could do it in a simpler way.  I
therefore propose to do what Ding Yihan suggested, and what we had
also discussed previously in the code review:

* Let's serialize the landlock_restrict_self()-with-TSYNC operations
  through the cred_guard_mutex.

This will resolve the issue where competing landlock_restrict_self()
calls with TSYNC can deadlock.  It will also remove the jittery
behavior for that worst case where the conflict is resolved through
retry.

So in my mind, we need both patches:

 * The fix to the cleanup path from 1. above, to make interruption
   work more reliably and to correct the misunderstandings in the
   comments.
 * cred_guard_mutex to serialize the TSYNC invocations.

Please let me know what you think.

Thanks,
–Günther


* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-23  9:42         ` Günther Noack
@ 2026-02-23 11:29           ` Ding Yihan
  2026-02-23 15:16             ` Günther Noack
  2026-02-24  6:27           ` [PATCH] landlock: Fix deadlock " Yihan Ding
  1 sibling, 1 reply; 27+ messages in thread
From: Ding Yihan @ 2026-02-23 11:29 UTC
  To: Günther Noack
  Cc: syzbot, Mickaël Salaün, linux-security-module,
	Jann Horn, Paul Moore

Hi Günther,

Thank you for the detailed analysis and the clear breakdown. 
Apologies for the delayed response. I spent the last couple of days
thoroughly reading through the previous mailing list discussions. I
was trying hard to see if there was any viable pure lockless design
that could solve this concurrency issue while preserving the original
architecture. 

However, after looking at the complexities you outlined, I completely
agree with your conclusion: serializing the TSYNC operations is indeed
the most robust and reasonable path forward to prevent the deadlock.

Regarding the lock choice, since 'cred_guard_mutex' is explicitly
marked as deprecated for new code in the kernel, maybe we can use its
modern replacement, 'exec_update_lock' (using down_write_trylock /
up_write on current->signal). This aligns with the current subsystem
standards and was also briefly touched upon by Jann in the older
discussions.
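
To make the trylock idea concrete, here is a minimal userspace sketch
of the pattern, not kernel code.  Everything in it is an assumption
for illustration: the atomic_flag tsync_busy stands in for the real
lock, run_serialized() is a hypothetical helper, and -EAGAIN is only a
placeholder for whatever error or restart behavior the final patch
chooses when the lock is contended.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-process lock; clear means unlocked. */
static atomic_flag tsync_busy = ATOMIC_FLAG_INIT;

/*
 * Runs op(arg) while holding the try-lock.  Returns op's result, or
 * -EAGAIN (a placeholder in this model) if another caller holds it.
 */
static int run_serialized(int (*op)(void *), void *arg)
{
	int ret;

	assert(op != NULL);
	if (atomic_flag_test_and_set(&tsync_busy))
		return -EAGAIN;
	ret = op(arg);
	atomic_flag_clear(&tsync_busy);
	return ret;
}

/* Trivial operation standing in for the thread-synchronized enforcement. */
static int enforce_noop(void *arg)
{
	(void)arg;
	return 0;
}

/* Simulates a competing call that arrives while the lock is held. */
static int competing_call(void *arg)
{
	return run_serialized(enforce_noop, arg);
}
```

In this model, a call that arrives while another one is in progress
backs off immediately instead of enqueueing work of its own, so two
enforcement operations never interleave.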

I fully understand the requirement for the two-part patch series:
1. Cleaning up the cancellation logic and comments.
2. Introducing the serialization lock for TSYNC.

I will take some time to draft and test this patch series properly. 
I also plan to discuss this with my kernel colleagues here at 
UnionTech to see if they have any additional suggestions on the 
implementation details before I submit it.

I will send out the v1 patch series to the list as soon as it is
ready. Thanks again for your guidance and the great discussion!

Best regards,
Yihan Ding



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-23 11:29           ` Ding Yihan
@ 2026-02-23 15:16             ` Günther Noack
  2026-02-24  3:02               ` Ding Yihan
  0 siblings, 1 reply; 27+ messages in thread
From: Günther Noack @ 2026-02-23 15:16 UTC (permalink / raw)
  To: Ding Yihan
  Cc: Günther Noack, syzbot, Mickaël Salaün,
	linux-security-module, Jann Horn, Paul Moore

Hello!

On Mon, Feb 23, 2026 at 07:29:56PM +0800, Ding Yihan wrote:
> Thank you for the detailed analysis and the clear breakdown. 
> Apologies for the delayed response. I spent the last couple of days
> thoroughly reading through the previous mailing list discussions. I
> was trying hard to see if there was any viable pure lockless design
> that could solve this concurrency issue while preserving the original
> architecture. 
> 
> However, after looking at the complexities you outlined, I completely
> agree with your conclusion: serializing the TSYNC operations is indeed
> the most robust and reasonable path forward to prevent the deadlock.
> 
> Regarding the lock choice, since 'cred_guard_mutex' is explicitly
> marked as deprecated for new code in the kernel, maybe we can use its
> modern replacement: 'exec_update_lock' (using down_write_trylock /
> up_write on current->signal). This aligns with the current subsystem
> standards and was also briefly touched upon by Jann in the older
> discussions.
> 
> I fully understand the requirement for the two-part patch series:
> 1. Cleaning up the cancellation logic and comments.
> 2. Introducing the serialization lock for TSYNC.
> 
> I will take some time to draft and test this patch series properly. 
> I also plan to discuss this with my kernel colleagues here at 
> UnionTech to see if they have any additional suggestions on the 
> implementation details before I submit it.
> 
> I will send out the v1 patch series to the list as soon as it is
> ready. Thanks again for your guidance and the great discussion!

Thank you, Ding, this is much appreciated!

I agree, the `exec_update_lock` might be the better solution;
I also need to familiarize myself more with it to double-check.
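
For illustration only, here is a minimal userspace sketch of the
serialization idea, assuming a "try the lock, otherwise ask the caller
to retry" shape.  A pthread mutex stands in for exec_update_lock, and
all names (tsync_lock_model, tsync_runs_model,
restrict_self_tsync_model) are hypothetical.  This is not the kernel
implementation; it only shows that at most one caller is inside the
serialized section at a time and that a competing caller backs off
instead of blocking.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for current->signal->exec_update_lock. */
static pthread_mutex_t tsync_lock_model = PTHREAD_MUTEX_INITIALIZER;

/* Counts how many callers ran the serialized section. */
static int tsync_runs_model;

/*
 * Returns 0 after running the serialized section, or -EAGAIN if another
 * caller currently holds the lock and this caller should retry later.
 */
static int restrict_self_tsync_model(void)
{
	int err = pthread_mutex_trylock(&tsync_lock_model);

	if (err == EBUSY)
		return -EAGAIN;	/* back off instead of waiting in place */
	if (err)
		return -err;

	tsync_runs_model++;	/* the thread-sync work would go here */

	pthread_mutex_unlock(&tsync_lock_model);
	return 0;
}
```

Because a competing caller returns instead of sleeping, it stays free
to run work that the lock holder has queued for it, which is the
property the mutual-wait scenario was missing.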

—Günther

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-23 15:16             ` Günther Noack
@ 2026-02-24  3:02               ` Ding Yihan
  2026-02-24  3:03                 ` syzbot
  0 siblings, 1 reply; 27+ messages in thread
From: Ding Yihan @ 2026-02-24  3:02 UTC (permalink / raw)
  To: Günther Noack
  Cc: Günther Noack, syzbot, Mickaël Salaün,
	linux-security-module, Jann Horn, Paul Moore

Hi Günther,

Thank you for the detailed analysis! I completely agree that serializing the TSYNC 
operations is the right way to prevent this deadlock. I have drafted a patch using 
`exec_update_lock` (similar to how seccomp uses `cred_guard_mutex`).

Regarding your proposal to split this into two patches (one for the cleanup 
path and one for the lock): Maybe combining them into a single patch is a better choice. Here is why:

We actually *cannot* remove `wait_for_completion(&shared_ctx.all_prepared)` 
in the interrupt recovery path. Since `shared_ctx` is allocated on the local 
stack of the caller, removing the wait would cause a severe Use-After-Free (UAF) if the 
thread returns to userspace while sibling task_works are still executing and dereferencing `ctx`. 

By adding the lock, we inherently resolve the deadlock, meaning the sibling task_works 
will never get stuck. Thus, `wait_for_completion` becomes perfectly safe to keep, 
and it remains strictly necessary to protect the stack memory. Therefore, the "fix" for the 
cleanup path is simply updating the comments to reflect this reality, which is tightly coupled with the locking fix. 
It felt more cohesive as a single patch.

I have tested the patch on my laptop, and it does not trigger the issue. Let's have syzbot test this combined logic:

#syz test: 

--- a/security/landlock/tsync.c
+++ b/security/landlock/tsync.c
@@ -447,6 +447,12 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
        shared_ctx.new_cred = new_cred;
        shared_ctx.set_no_new_privs = task_no_new_privs(current);
 
+       /*
+        * Serialize concurrent TSYNC operations to prevent deadlocks
+        * when multiple threads call landlock_restrict_self() simultaneously.
+        */
+       down_write(&current->signal->exec_update_lock);
+
        /*
         * We schedule a pseudo-signal task_work for each of the calling task's
         * sibling threads.  In the task work, each thread:
@@ -527,14 +533,17 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
                                           -ERESTARTNOINTR);
 
                                /*
-                                * Cancel task works for tasks that did not start running yet,
-                                * and decrement all_prepared and num_unfinished accordingly.
+                                * Opportunistic improvement: try to cancel task works
+                                * for tasks that did not start running yet. We do not
+                                * have a guarantee that it cancels any of the enqueued
+                                * task works (because task_work_run() might already have
+                                * dequeued them).
                                 */
                                cancel_tsync_works(&works, &shared_ctx);
 
                                /*
-                                * The remaining task works have started running, so waiting for
-                                * their completion will finish.
+                                * We must wait for the remaining task works to finish to
+                                * prevent a use-after-free of the local shared_ctx.
                                 */
                                wait_for_completion(&shared_ctx.all_prepared);
                        }
@@ -557,5 +566,7 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
 
        tsync_works_release(&works);
 
+       up_write(&current->signal->exec_update_lock);
+
        return atomic_read(&shared_ctx.preparation_error);
 }

--
On 2026/2/23 23:16, Günther Noack wrote:
> Hello!
> 
> On Mon, Feb 23, 2026 at 07:29:56PM +0800, Ding Yihan wrote:
>> Thank you for the detailed analysis and the clear breakdown. 
>> Apologies for the delayed response. I spent the last couple of days
>> thoroughly reading through the previous mailing list discussions. I
>> was trying hard to see if there was any viable pure lockless design
>> that could solve this concurrency issue while preserving the original
>> architecture. 
>> 
>> However, after looking at the complexities you outlined, I completely
>> agree with your conclusion: serializing the TSYNC operations is indeed
>> the most robust and reasonable path forward to prevent the deadlock.
>> 
>> Regarding the lock choice, since 'cred_guard_mutex' is explicitly
>> marked as deprecated for new code in the kernel, maybe we can use its
>> modern replacement: 'exec_update_lock' (using down_write_trylock /
>> up_write on current->signal). This aligns with the current subsystem
>> standards and was also briefly touched upon by Jann in the older
>> discussions.
>> 
>> I fully understand the requirement for the two-part patch series:
>> 1. Cleaning up the cancellation logic and comments.
>> 2. Introducing the serialization lock for TSYNC.
>> 
>> I will take some time to draft and test this patch series properly. 
>> I also plan to discuss this with my kernel colleagues here at 
>> UnionTech to see if they have any additional suggestions on the 
>> implementation details before I submit it.
>> 
>> I will send out the v1 patch series to the list as soon as it is
>> ready. Thanks again for your guidance and the great discussion!
> 
> Thank you, Ding, this is much appreciated!
> 
> I agree, the `exec_update_lock` might be the better solution;
> I also need to familiarize myself more with it to double-check.
> 
> —Günther
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-24  3:02               ` Ding Yihan
@ 2026-02-24  3:03                 ` syzbot
  0 siblings, 0 replies; 27+ messages in thread
From: syzbot @ 2026-02-24  3:03 UTC (permalink / raw)
  To: dingyihan
  Cc: dingyihan, gnoack3000, gnoack, jannh, linux-security-module, mic,
	paul, linux-kernel, syzkaller-bugs

> Hi Günther,
>
> Thank you for the detailed analysis! I completely agree that serializing the TSYNC 
> operations is the right way to prevent this deadlock. I have drafted a patch using 
> `exec_update_lock` (similar to how seccomp uses `cred_guard_mutex`).
>
> Regarding your proposal to split this into two patches (one for the cleanup 
> path and one for the lock): Maybe combining them into a single patch is a better choice. Here is why:
>
> We actually *cannot* remove `wait_for_completion(&shared_ctx.all_prepared)` 
> in the interrupt recovery path. Since `shared_ctx` is allocated on the local 
> stack of the caller, removing the wait would cause a severe Use-After-Free (UAF) if the 
> thread returns to userspace while sibling task_works are still executing and dereferencing `ctx`. 
>
> By adding the lock, we inherently resolve the deadlock, meaning the sibling task_works 
> will never get stuck. Thus, `wait_for_completion` becomes perfectly safe to keep, 
> and it remains strictly necessary to protect the stack memory. Therefore, the "fix" for the 
> cleanup path is simply updating the comments to reflect this reality, which is tightly coupled with the locking fix. 
> It felt more cohesive as a single patch.
>
> I have tested the patch on my laptop, and it does not trigger the issue. Let's have syzbot test this combined logic:
>
> #syz test: 

"---" does not look like a valid git repo address.

>
> --- a/security/landlock/tsync.c
> +++ b/security/landlock/tsync.c
> @@ -447,6 +447,12 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
>         shared_ctx.new_cred = new_cred;
>         shared_ctx.set_no_new_privs = task_no_new_privs(current);
>  
> +       /*
> +        * Serialize concurrent TSYNC operations to prevent deadlocks
> +        * when multiple threads call landlock_restrict_self() simultaneously.
> +        */
> +       down_write(&current->signal->exec_update_lock);
> +
>         /*
>          * We schedule a pseudo-signal task_work for each of the calling task's
>          * sibling threads.  In the task work, each thread:
> @@ -527,14 +533,17 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
>                                            -ERESTARTNOINTR);
>  
>                                 /*
> -                                * Cancel task works for tasks that did not start running yet,
> -                                * and decrement all_prepared and num_unfinished accordingly.
> +                                * Opportunistic improvement: try to cancel task works
> +                                * for tasks that did not start running yet. We do not
> +                                * have a guarantee that it cancels any of the enqueued
> +                                * task works (because task_work_run() might already have
> +                                * dequeued them).
>                                  */
>                                 cancel_tsync_works(&works, &shared_ctx);
>  
>                                 /*
> -                                * The remaining task works have started running, so waiting for
> -                                * their completion will finish.
> +                                * We must wait for the remaining task works to finish to
> +                                * prevent a use-after-free of the local shared_ctx.
>                                  */
>                                 wait_for_completion(&shared_ctx.all_prepared);
>                         }
> @@ -557,5 +566,7 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
>  
>         tsync_works_release(&works);
>  
> +       up_write(&current->signal->exec_update_lock);
> +
>         return atomic_read(&shared_ctx.preparation_error);
>  }
>
> --
> On 2026/2/23 23:16, Günther Noack wrote:
>> Hello!
>> 
>> On Mon, Feb 23, 2026 at 07:29:56PM +0800, Ding Yihan wrote:
>>> Thank you for the detailed analysis and the clear breakdown. 
>>> Apologies for the delayed response. I spent the last couple of days
>>> thoroughly reading through the previous mailing list discussions. I
>>> was trying hard to see if there was any viable pure lockless design
>>> that could solve this concurrency issue while preserving the original
>>> architecture. 
>>> 
>>> However, after looking at the complexities you outlined, I completely
>>> agree with your conclusion: serializing the TSYNC operations is indeed
>>> the most robust and reasonable path forward to prevent the deadlock.
>>> 
>>> Regarding the lock choice, since 'cred_guard_mutex' is explicitly
>>> marked as deprecated for new code in the kernel, maybe we can use its
>>> modern replacement: 'exec_update_lock' (using down_write_trylock /
>>> up_write on current->signal). This aligns with the current subsystem
>>> standards and was also briefly touched upon by Jann in the older
>>> discussions.
>>> 
>>> I fully understand the requirement for the two-part patch series:
>>> 1. Cleaning up the cancellation logic and comments.
>>> 2. Introducing the serialization lock for TSYNC.
>>> 
>>> I will take some time to draft and test this patch series properly. 
>>> I also plan to discuss this with my kernel colleagues here at 
>>> UnionTech to see if they have any additional suggestions on the 
>>> implementation details before I submit it.
>>> 
>>> I will send out the v1 patch series to the list as soon as it is
>>> ready. Thanks again for your guidance and the great discussion!
>> 
>> Thank you, Ding, this is much appreciated!
>> 
>> I agree, the `exec_update_lock` might be the better solution;
>> I also need to familiarize myself more with it to double-check.
>> 
>> —Günther
>> 
>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH] landlock: Fix deadlock in restrict_one_thread_callback
  2026-02-23  9:42         ` Günther Noack
  2026-02-23 11:29           ` Ding Yihan
@ 2026-02-24  6:27           ` Yihan Ding
  2026-02-24  8:48             ` Günther Noack
  1 sibling, 1 reply; 27+ messages in thread
From: Yihan Ding @ 2026-02-24  6:27 UTC (permalink / raw)
  To: Mickaël Salaün, Günther Noack
  Cc: Paul Moore, Jann Horn, linux-security-module, linux-kernel,
	syzbot+7ea2f5e9dfd468201817, Yihan Ding

syzbot found a deadlock in landlock_restrict_sibling_threads().
When multiple threads concurrently call landlock_restrict_self() with
sibling thread restriction enabled, they can deadlock by mutually
queueing task_works on each other and then blocking in kernel space
(waiting for the other to finish).

Fix this by serializing the TSYNC operations within the same process
using the exec_update_lock. This prevents concurrent invocations
from deadlocking.

Additionally, update the comments in the interrupt recovery path to
clarify that cancel_tsync_works() is an opportunistic cleanup, and
waiting for completion is strictly necessary to prevent a Use-After-Free
of the stack-allocated shared_ctx.

Fixes: 42fc7e6543f6 ("landlock: Multithreading support for landlock_restrict_self()")
Reported-by: syzbot+7ea2f5e9dfd468201817@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
Signed-off-by: Yihan Ding <dingyihan@uniontech.com>
---
 security/landlock/tsync.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/security/landlock/tsync.c b/security/landlock/tsync.c
index de01aa899751..4e91af271f3b 100644
--- a/security/landlock/tsync.c
+++ b/security/landlock/tsync.c
@@ -447,6 +447,12 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
 	shared_ctx.new_cred = new_cred;
 	shared_ctx.set_no_new_privs = task_no_new_privs(current);
 
+	/*
+	 * Serialize concurrent TSYNC operations to prevent deadlocks
+	 * when multiple threads call landlock_restrict_self() simultaneously.
+	 */
+	down_write(&current->signal->exec_update_lock);
+
 	/*
 	 * We schedule a pseudo-signal task_work for each of the calling task's
 	 * sibling threads.  In the task work, each thread:
@@ -527,14 +533,17 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
 					   -ERESTARTNOINTR);
 
 				/*
-				 * Cancel task works for tasks that did not start running yet,
-				 * and decrement all_prepared and num_unfinished accordingly.
+				 * Opportunistic improvement: try to cancel task works
+				 * for tasks that did not start running yet. We do not
+				 * have a guarantee that it cancels any of the enqueued
+				 * task works (because task_work_run() might already have
+				 * dequeued them).
 				 */
 				cancel_tsync_works(&works, &shared_ctx);
 
 				/*
-				 * The remaining task works have started running, so waiting for
-				 * their completion will finish.
+				 * We must wait for the remaining task works to finish to
+				 * prevent a use-after-free of the local shared_ctx.
 				 */
 				wait_for_completion(&shared_ctx.all_prepared);
 			}
@@ -557,5 +566,7 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
 
 	tsync_works_release(&works);
 
+	up_write(&current->signal->exec_update_lock);
+
 	return atomic_read(&shared_ctx.preparation_error);
 }
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH] landlock: Fix deadlock in restrict_one_thread_callback
  2026-02-24  6:27           ` [PATCH] landlock: Fix deadlock " Yihan Ding
@ 2026-02-24  8:48             ` Günther Noack
  0 siblings, 0 replies; 27+ messages in thread
From: Günther Noack @ 2026-02-24  8:48 UTC (permalink / raw)
  To: Yihan Ding
  Cc: Mickaël Salaün, Paul Moore, Jann Horn,
	linux-security-module, linux-kernel, syzbot+7ea2f5e9dfd468201817

Hello!

Thanks for sending the patch!

On Tue, Feb 24, 2026 at 02:27:29PM +0800, Yihan Ding wrote:
> syzbot found a deadlock in landlock_restrict_sibling_threads().
> When multiple threads concurrently call landlock_restrict_self() with
> sibling thread restriction enabled, they can deadlock by mutually
> queueing task_works on each other and then blocking in kernel space
> (waiting for the other to finish).
> 
> Fix this by serializing the TSYNC operations within the same process
> using the exec_update_lock. This prevents concurrent invocations
> from deadlocking.
> 
> Additionally, update the comments in the interrupt recovery path to
> clarify that cancel_tsync_works() is an opportunistic cleanup, and
> waiting for completion is strictly necessary to prevent a Use-After-Free
> of the stack-allocated shared_ctx.
> 
> Fixes: 42fc7e6543f6 ("landlock: Multithreading support for landlock_restrict_self()")
> Reported-by: syzbot+7ea2f5e9dfd468201817@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
> Signed-off-by: Yihan Ding <dingyihan@uniontech.com>
> ---
>  security/landlock/tsync.c | 19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/security/landlock/tsync.c b/security/landlock/tsync.c
> index de01aa899751..4e91af271f3b 100644
> --- a/security/landlock/tsync.c
> +++ b/security/landlock/tsync.c
> @@ -447,6 +447,12 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
>  	shared_ctx.new_cred = new_cred;
>  	shared_ctx.set_no_new_privs = task_no_new_privs(current);
>  
> +	/*
> +	 * Serialize concurrent TSYNC operations to prevent deadlocks
> +	 * when multiple threads call landlock_restrict_self() simultaneously.
> +	 */
> +	down_write(&current->signal->exec_update_lock);

Should we use the *_killable variant of this lock acquisition?


>  	/*
>  	 * We schedule a pseudo-signal task_work for each of the calling task's
>  	 * sibling threads.  In the task work, each thread:
> @@ -527,14 +533,17 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
>  					   -ERESTARTNOINTR);
>  
>  				/*
> -				 * Cancel task works for tasks that did not start running yet,
> -				 * and decrement all_prepared and num_unfinished accordingly.
> +				 * Opportunistic improvement: try to cancel task works
> +				 * for tasks that did not start running yet. We do not
> +				 * have a guarantee that it cancels any of the enqueued
> +				 * task works (because task_work_run() might already have
> +				 * dequeued them).
>  				 */
>  				cancel_tsync_works(&works, &shared_ctx);
>  
>  				/*
> -				 * The remaining task works have started running, so waiting for
> -				 * their completion will finish.
> +				 * We must wait for the remaining task works to finish to
> +				 * prevent a use-after-free of the local shared_ctx.
>  				 */
>  				wait_for_completion(&shared_ctx.all_prepared);

I do not think that we must wait for all_prepared here, as your
updated comment says: The landlock_restrict_sibling_threads() function
still waits for all of these task works to finish at the bottom where
it waits for "all_finished", so there is no UAF on the local shared
context?
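
To illustrate the lifetime argument, here is a minimal userspace
sketch using pthreads (not kernel code; every name ending in _model is
hypothetical).  It assumes the same shape as described above: the
shared context lives on the caller's stack, and the caller does one
wait at the bottom until no started work can still touch it.  No
earlier wait is needed for memory safety in this model.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_SIBLINGS_MODEL 16

/* Hypothetical userspace stand-in for the stack-allocated shared_ctx. */
struct shared_ctx_model {
	pthread_mutex_t lock;
	pthread_cond_t all_finished;
	int num_unfinished;	/* works that may still touch this struct */
	int num_committed;	/* works that ran to completion */
};

/* Models one sibling's task work: use the context, then report completion. */
static void *sibling_work_model(void *arg)
{
	struct shared_ctx_model *ctx = arg;

	pthread_mutex_lock(&ctx->lock);
	ctx->num_committed++;
	if (--ctx->num_unfinished == 0)
		pthread_cond_signal(&ctx->all_finished);
	pthread_mutex_unlock(&ctx->lock);
	return NULL;
}

/* Returns the number of works that completed, or -1 on bad input. */
static int restrict_siblings_model(int nr_siblings)
{
	struct shared_ctx_model ctx;	/* lives on this function's stack */
	pthread_t threads[MAX_SIBLINGS_MODEL];
	int started = 0, committed, i;

	if (nr_siblings < 0 || nr_siblings > MAX_SIBLINGS_MODEL)
		return -1;

	pthread_mutex_init(&ctx.lock, NULL);
	pthread_cond_init(&ctx.all_finished, NULL);
	ctx.num_unfinished = nr_siblings;
	ctx.num_committed = 0;

	for (i = 0; i < nr_siblings; i++) {
		if (pthread_create(&threads[started], NULL,
				   sibling_work_model, &ctx))
			break;
		started++;
	}

	pthread_mutex_lock(&ctx.lock);
	/* Works that never started cannot touch ctx; stop counting them. */
	ctx.num_unfinished -= nr_siblings - started;
	/* The single wait at the bottom: ctx outlives every started work. */
	while (ctx.num_unfinished > 0)
		pthread_cond_wait(&ctx.all_finished, &ctx.lock);
	committed = ctx.num_committed;
	pthread_mutex_unlock(&ctx.lock);

	/* Userspace bookkeeping only: reap threads before ctx goes away. */
	for (i = 0; i < started; i++)
		pthread_join(threads[i], NULL);
	pthread_cond_destroy(&ctx.all_finished);
	pthread_mutex_destroy(&ctx.lock);

	return committed;
}
```

For example, restrict_siblings_model(4) returns 4 once all four
modeled works have finished.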

I would recommend replacing the
wait_for_completion(&shared_ctx.all_prepared) call and its comment
with an explicit "break":

/*
 * Break the loop with error.  The cleanup code after the loop
 * unblocks the remaining task_works.
 */
break;

Please also update the comment above the complete_all(ready_to_commit):

  We now have either (a) all sibling threads blocking and in
  "prepared" state in the task work, or (b) the preparation error is
  set.  Ask all threads to commit (or abort).

Then it is a bit more explicit about the error handling variant of this.


(FYI, I have tested the patch variant where I only removed the
wait_for_completion(all_prepared) call, and where I did *not* add the
additional lock at the top.  In this configuration, I was unable to
get it to hang any more, even with added mdelays.  But as discussed in
section 2.2 of [1], there are still difficult to reproduce scenarios
where this can theoretically fail, and it is better to use the lock at
the top.)

[1] https://lore.kernel.org/all/20260223.52c45aed20f8@gnoack.org/

Please also feel free to split up the change into a part that adds the
exec_update_lock and a part that changes the path where the calling
thread gets interrupted.  Strictly speaking, the part where we change
the interruption logic is only a nicety once we have the
exec_update_lock in place.

>  			}
> @@ -557,5 +566,7 @@ int landlock_restrict_sibling_threads(const struct cred *old_cred,
>  
>  	tsync_works_release(&works);
>  
> +	up_write(&current->signal->exec_update_lock);
> +
>  	return atomic_read(&shared_ctx.preparation_error);
>  }
> -- 
> 2.51.0
> 

Thanks,
–Günther

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-21  7:28   ` Ding Yihan
  2026-02-21 12:00     ` Günther Noack
@ 2026-02-24 14:43     ` Günther Noack
  1 sibling, 0 replies; 27+ messages in thread
From: Günther Noack @ 2026-02-24 14:43 UTC (permalink / raw)
  To: syzbot; +Cc: linux-security-module

#syz set subsystems: lsm, kernel

^ permalink raw reply	[flat|nested] 27+ messages in thread

[parent not found: <7F45E41D790CD8A4+f1dcffc7-5b69-432f-8ad7-e96a3ef66219@uniontech.com>]

* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
       [not found] <7F45E41D790CD8A4+f1dcffc7-5b69-432f-8ad7-e96a3ef66219@uniontech.com>
@ 2026-02-24  5:07 ` syzbot
  0 siblings, 0 replies; 27+ messages in thread
From: syzbot @ 2026-02-24  5:07 UTC (permalink / raw)
  To: dingyihan, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+7ea2f5e9dfd468201817@syzkaller.appspotmail.com
Tested-by: syzbot+7ea2f5e9dfd468201817@syzkaller.appspotmail.com

Tested on:

commit:         635c467c Add linux-next specific files for 20260213
git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=124db722580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=61690c38d1398936
dashboard link: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
patch:          https://syzkaller.appspot.com/x/patch.diff?x=11522152580000

Note: testing is done by a robot and is best-effort only.

^ permalink raw reply	[flat|nested] 27+ messages in thread

[parent not found: <F4E5EFD28BFB7AA6+108340fb-0592-4dd0-9f93-b7a2b760dc5d@uniontech.com>]

* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
       [not found] <F4E5EFD28BFB7AA6+108340fb-0592-4dd0-9f93-b7a2b760dc5d@uniontech.com>
@ 2026-02-24  4:08 ` syzbot
  0 siblings, 0 replies; 27+ messages in thread
From: syzbot @ 2026-02-24  4:08 UTC (permalink / raw)
  To: dingyihan, linux-kernel, syzkaller-bugs

Hello,

syzbot tried to test the proposed patch but the build/boot failed:

 team0: Port device team_slave_1 added
[   77.956359][ T5910] batman_adv: batadv0: Adding interface: batadv_slave_0
[   77.963600][ T5910] batman_adv: batadv0: The MTU of interface batadv_slave_0 is too small (1500) to handle the transport of batman-adv packets. Packets going over this interface will be fragmented on layer2 which could impact the performance. Setting the MTU to 1532 would solve the problem.
[   77.989888][ T5910] batman_adv: batadv0: Not using interface batadv_slave_0 (retrying later): interface not active
[   78.004522][ T5910] batman_adv: batadv0: Adding interface: batadv_slave_1
[   78.011605][ T5910] batman_adv: batadv0: The MTU of interface batadv_slave_1 is too small (1500) to handle the transport of batman-adv packets. Packets going over this interface will be fragmented on layer2 which could impact the performance. Setting the MTU to 1532 would solve the problem.
[   78.038002][ T5910] batman_adv: batadv0: Not using interface batadv_slave_1 (retrying later): interface not active
[   78.080982][ T5910] hsr_slave_0: entered promiscuous mode
[   78.087753][ T5910] hsr_slave_1: entered promiscuous mode
[   78.233618][ T5910] netdevsim netdevsim0 netdevsim0: renamed from eth0
[   78.246125][ T5910] netdevsim netdevsim0 netdevsim1: renamed from eth1
[   78.256883][ T5910] netdevsim netdevsim0 netdevsim2: renamed from eth2
[   78.267118][ T5910] netdevsim netdevsim0 netdevsim3: renamed from eth3
[   78.297441][ T5910] bridge0: port 2(bridge_slave_1) entered blocking state
[   78.304646][ T5910] bridge0: port 2(bridge_slave_1) entered forwarding state
[   78.312595][ T5910] bridge0: port 1(bridge_slave_0) entered blocking state
[   78.319734][ T5910] bridge0: port 1(bridge_slave_0) entered forwarding state
[   78.384207][ T5910] 8021q: adding VLAN 0 to HW filter on device bond0
[   78.404563][ T1004] bridge0: port 1(bridge_slave_0) entered disabled state
[   78.413314][ T1004] bridge0: port 2(bridge_slave_1) entered disabled state
[   78.428785][ T5910] 8021q: adding VLAN 0 to HW filter on device team0
[   78.443104][   T65] bridge0: port 1(bridge_slave_0) entered blocking state
[   78.450377][   T65] bridge0: port 1(bridge_slave_0) entered forwarding state
[   78.464952][   T12] bridge0: port 2(bridge_slave_1) entered blocking state
[   78.472313][   T12] bridge0: port 2(bridge_slave_1) entered forwarding state
[   78.645479][ T5910] 8021q: adding VLAN 0 to HW filter on device batadv0
[   78.696816][ T5910] veth0_vlan: entered promiscuous mode
[   78.709183][ T5910] veth1_vlan: entered promiscuous mode
[   78.745020][ T5910] veth0_macvtap: entered promiscuous mode
[   78.756102][ T5910] veth1_macvtap: entered promiscuous mode
[   78.779292][ T5910] batman_adv: batadv0: Interface activated: batadv_slave_0
[   78.795080][ T5910] batman_adv: batadv0: Interface activated: batadv_slave_1
[   78.812914][   T12] netdevsim netdevsim0 netdevsim0: set [1, 0] type 2 family 0 port 6081 - 0
[   78.824397][   T12] netdevsim netdevsim0 netdevsim1: set [1, 0] type 2 family 0 port 6081 - 0
[   78.840142][   T12] netdevsim netdevsim0 netdevsim2: set [1, 0] type 2 family 0 port 6081 - 0
[   78.850096][   T12] netdevsim netdevsim0 netdevsim3: set [1, 0] type 2 family 0 port 6081 - 0
[   78.975137][   T13] netdevsim netdevsim0 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
[   79.080924][   T13] netdevsim netdevsim0 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
[   79.141701][   T13] netdevsim netdevsim0 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
[   79.219381][   T13] netdevsim netdevsim0 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
2026/02/24 04:06:47 executed programs: 0
[   79.336447][ T5143] Bluetooth: hci0: unexpected cc 0x0c03 length: 249 > 1
[   79.347436][ T5143] Bluetooth: hci0: unexpected cc 0x1003 length: 249 > 9
[   79.355112][ T5143] Bluetooth: hci0: unexpected cc 0x1001 length: 249 > 9
[   79.363931][ T5143] Bluetooth: hci0: unexpected cc 0x0c23 length: 249 > 4
[   79.372285][ T5143] Bluetooth: hci0: unexpected cc 0x0c38 length: 249 > 2
[   79.509332][ T5936] chnl_net:caif_netlink_parms(): no params data found
[   79.580835][ T5936] bridge0: port 1(bridge_slave_0) entered blocking state
[   79.588539][ T5936] bridge0: port 1(bridge_slave_0) entered disabled state
[   79.596107][ T5936] bridge_slave_0: entered allmulticast mode
[   79.603587][ T5936] bridge_slave_0: entered promiscuous mode
[   79.613548][ T5936] bridge0: port 2(bridge_slave_1) entered blocking state
[   79.620886][ T5936] bridge0: port 2(bridge_slave_1) entered disabled state
[   79.628076][ T5936] bridge_slave_1: entered allmulticast mode
[   79.635811][ T5936] bridge_slave_1: entered promiscuous mode
[   79.673346][ T5936] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link
[   79.687123][ T5936] bond0: (slave bond_slave_1): Enslaving as an active interface with an up link
[   79.721491][ T5936] team0: Port device team_slave_0 added
[   79.729676][ T5936] team0: Port device team_slave_1 added
[   79.759645][ T5936] batman_adv: batadv0: Adding interface: batadv_slave_0
[   79.766762][ T5936] batman_adv: batadv0: The MTU of interface batadv_slave_0 is too small (1500) to handle the transport of batman-adv packets. Packets going over this interface will be fragmented on layer2 which could impact the performance. Setting the MTU to 1532 would solve the problem.
[   79.793316][ T5936] batman_adv: batadv0: Not using interface batadv_slave_0 (retrying later): interface not active
[   79.806250][ T5936] batman_adv: batadv0: Adding interface: batadv_slave_1
[   79.816021][ T5936] batman_adv: batadv0: The MTU of interface batadv_slave_1 is too small (1500) to handle the transport of batman-adv packets. Packets going over this interface will be fragmented on layer2 which could impact the performance. Setting the MTU to 1532 would solve the problem.
[   79.842146][ T5936] batman_adv: batadv0: Not using interface batadv_slave_1 (retrying later): interface not active
[   79.891341][ T5936] hsr_slave_0: entered promiscuous mode
[   79.898314][ T5936] hsr_slave_1: entered promiscuous mode
[   79.906509][ T5936] debugfs: 'hsr0' already exists in 'hsr'
[   79.913302][ T5936] Cannot create hsr debugfs directory
[   81.411273][   T51] Bluetooth: hci0: command tx timeout
[   81.652909][   T42] cfg80211: failed to load regulatory.db
[   81.863344][   T13] bridge_slave_1: left allmulticast mode
[   81.869093][   T13] bridge_slave_1: left promiscuous mode
[   81.875947][   T13] bridge0: port 2(bridge_slave_1) entered disabled state
[   81.886804][   T13] bridge_slave_0: left allmulticast mode
[   81.895320][   T13] bridge_slave_0: left promiscuous mode
[   81.902418][   T13] bridge0: port 1(bridge_slave_0) entered disabled state
[   82.056915][   T13] bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
[   82.067689][   T13] bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
[   82.077661][   T13] bond0 (unregistering): Released all slaves
[   82.165586][   T13] hsr_slave_0: left promiscuous mode
[   82.171808][   T13] hsr_slave_1: left promiscuous mode
[   82.177965][   T13] batman_adv: batadv0: Interface deactivated: batadv_slave_0
[   82.188972][   T13] batman_adv: batadv0: Removing interface: batadv_slave_0
[   82.201305][   T13] batman_adv: batadv0: Interface deactivated: batadv_slave_1
[   82.208712][   T13] batman_adv: batadv0: Removing interface: batadv_slave_1
[   82.224168][   T13] veth1_macvtap: left promiscuous mode
[   82.229800][   T13] veth0_macvtap: left promiscuous mode
[   82.235949][   T13] veth1_vlan: left promiscuous mode
[   82.243307][   T13] veth0_vlan: left promiscuous mode
[   82.524939][   T13] team0 (unregistering): Port device team_slave_1 removed
[   82.542673][   T13] team0 (unregistering): Port device team_slave_0 removed
[   82.750272][    C1] list_del corruption, ffff88802a450890->next is NULL
[   82.757771][    C1] ------------[ cut here ]------------
[   82.763241][    C1] kernel BUG at lib/list_debug.c:53!
[   82.768569][    C1] Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
[   82.774843][    C1] CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted syzkaller #0 PREEMPT(full) 
[   82.783771][    C1] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026
[   82.793910][    C1] RIP: 0010:__list_del_entry_valid_or_report+0xdf/0x190
[   82.800835][    C1] Code: 49 39 1f 0f 85 9e 00 00 00 b0 01 5b 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc cc 48 c7 c7 40 c1 29 8c 48 89 de e8 42 29 65 fc 90 <0f> 0b 48 c7 c7 a0 c1 29 8c 48 89 de e8 30 29 65 fc 90 0f 0b 4c 89
[   82.820420][    C1] RSP: 0018:ffffc90000a08d58 EFLAGS: 00010046
[   82.826480][    C1] RAX: 0000000000000033 RBX: ffff88802a450890 RCX: e58e7f17a2332a00
[   82.834525][    C1] RDX: 0000000000000100 RSI: 0000000000000102 RDI: 0000000000000000
[   82.842485][    C1] RBP: 0000000000000203 R08: ffffc90000a08ae7 R09: 1ffff9200014115c
[   82.850447][    C1] R10: dffffc0000000000 R11: fffff5200014115d R12: 1ffff1100548a112
[   82.858428][    C1] R13: dffffc0000000000 R14: 0000000000000000 R15: 0000000000000000
[   82.866388][    C1] FS:  0000000000000000(0000) GS:ffff888125109000(0000) knlGS:0000000000000000
[   82.875311][    C1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   82.881921][    C1] CR2: 00007f396a9de368 CR3: 000000000e74c000 CR4: 00000000003526f0
[   82.889883][    C1] Call Trace:
[   82.893147][    C1]  <IRQ>
[   82.896012][    C1]  dst_destroy+0x202/0x5a0
[   82.900422][    C1]  ? _raw_spin_unlock_irqrestore+0x30/0x80
[   82.906213][    C1]  ? rcu_core+0x751/0x1070
[   82.910617][    C1]  ? __pfx_dst_destroy_rcu+0x10/0x10
[   82.915891][    C1]  rcu_core+0x7cd/0x1070
[   82.920126][    C1]  ? __pfx_rcu_core+0x10/0x10
[   82.924791][    C1]  ? sched_balance_domains+0x13a/0x950
[   82.930242][    C1]  handle_softirqs+0x22a/0x870
[   82.935088][    C1]  ? __irq_exit_rcu+0x5f/0x150
[   82.939842][    C1]  __irq_exit_rcu+0x5f/0x150
[   82.944421][    C1]  irq_exit_rcu+0x9/0x30
[   82.948653][    C1]  sysvec_apic_timer_interrupt+0xa6/0xc0
[   82.954279][    C1]  </IRQ>
[   82.957194][    C1]  <TASK>
[   82.960106][    C1]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
[   82.966079][    C1] RIP: 0010:pv_native_safe_halt+0xf/0x20
[   82.971695][    C1] Code: ee 73 02 c3 cc cc cc cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 66 90 0f 00 2d 43 5d 10 00 fb f4 <e9> 7c ea 02 00 cc cc cc cc cc cc cc cc cc cc cc cc 90 90 90 90 90
[   82.991282][    C1] RSP: 0018:ffffc90000197e20 EFLAGS: 00000242
[   82.997334][    C1] RAX: 0000000000045cc1 RBX: ffffffff819ad30d RCX: 0000000080000001
[   83.005293][    C1] RDX: 0000000000000001 RSI: ffffffff8e013334 RDI: ffffffff8c29be00
[   83.013439][    C1] RBP: ffffc90000197f10 R08: ffff8880b853395b R09: 1ffff110170a672b
[   83.021511][    C1] R10: dffffc0000000000 R11: ffffed10170a672c R12: ffffffff9033b6b0
[   83.029470][    C1] R13: 1ffff11003b5e000 R14: 0000000000000001 R15: 0000000000000001
[   83.037433][    C1]  ? do_idle+0x1bd/0x500
[   83.041683][    C1]  default_idle+0x9/0x20
[   83.045914][    C1]  default_idle_call+0x72/0xb0
[   83.050753][    C1]  do_idle+0x1bd/0x500
[   83.054807][    C1]  ? _raw_spin_unlock_irqrestore+0x4c/0x80
[   83.060598][    C1]  ? __pfx_do_idle+0x10/0x10
[   83.065174][    C1]  ? _raw_spin_unlock_irqrestore+0x30/0x80
[   83.071057][    C1]  cpu_startup_entry+0x43/0x60
[   83.075812][    C1]  start_secondary+0x101/0x110
[   83.080648][    C1]  common_startup_64+0x13e/0x147
[   83.085576][    C1]  </TASK>
[   83.088575][    C1] Modules linked in:
[   83.092472][    C1] ---[ end trace 0000000000000000 ]---
[   83.097909][    C1] RIP: 0010:__list_del_entry_valid_or_report+0xdf/0x190
[   83.104866][    C1] Code: 49 39 1f 0f 85 9e 00 00 00 b0 01 5b 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc cc 48 c7 c7 40 c1 29 8c 48 89 de e8 42 29 65 fc 90 <0f> 0b 48 c7 c7 a0 c1 29 8c 48 89 de e8 30 29 65 fc 90 0f 0b 4c 89
[   83.124458][    C1] RSP: 0018:ffffc90000a08d58 EFLAGS: 00010046
[   83.130525][    C1] RAX: 0000000000000033 RBX: ffff88802a450890 RCX: e58e7f17a2332a00
[   83.138482][    C1] RDX: 0000000000000100 RSI: 0000000000000102 RDI: 0000000000000000
[   83.146435][    C1] RBP: 0000000000000203 R08: ffffc90000a08ae7 R09: 1ffff9200014115c
[   83.154477][    C1] R10: dffffc0000000000 R11: fffff5200014115d R12: 1ffff1100548a112
[   83.162435][    C1] R13: dffffc0000000000 R14: 0000000000000000 R15: 0000000000000000
[   83.170390][    C1] FS:  0000000000000000(0000) GS:ffff888125109000(0000) knlGS:0000000000000000
[   83.179304][    C1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   83.185872][    C1] CR2: 00007f396a9de368 CR3: 000000000e74c000 CR4: 00000000003526f0
[   83.193840][    C1] Kernel panic - not syncing: Fatal exception in interrupt
[   83.201135][    C1] Kernel Offset: disabled
[   83.205437][    C1] Rebooting in 86400 seconds..


syzkaller build log:
go env (err=<nil>)
AR='ar'
CC='gcc'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='g++'
GCCGO='gccgo'
GO111MODULE='auto'
GOAMD64='v1'
GOARCH='amd64'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/syzkaller/.cache/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/syzkaller/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build3187194119=/tmp/go-build -gno-record-gcc-switches'
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMOD='/syzkaller/jobs-2/linux/gopath/src/github.com/google/syzkaller/go.mod'
GOMODCACHE='/syzkaller/jobs-2/linux/gopath/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/syzkaller/jobs-2/linux/gopath'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTELEMETRY='local'
GOTELEMETRYDIR='/syzkaller/.config/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.24.4'
GOWORK=''
PKG_CONFIG='pkg-config'

git status (err=<nil>)
HEAD detached at 1e62d19825
nothing to commit, working tree clean


tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
go list -f '{{.Stale}}' -ldflags="-s -w -X github.com/google/syzkaller/prog.GitRevision=1e62d1982527c3b4e18df04d61f2560fa1f434cc -X github.com/google/syzkaller/prog.gitRevisionDate=20260213-152336"  ./sys/syz-sysgen | grep -q false || go install -ldflags="-s -w -X github.com/google/syzkaller/prog.GitRevision=1e62d1982527c3b4e18df04d61f2560fa1f434cc -X github.com/google/syzkaller/prog.gitRevisionDate=20260213-152336"  ./sys/syz-sysgen
make .descriptions
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
bin/syz-sysgen
touch .descriptions
GOOS=linux GOARCH=amd64 go build -ldflags="-s -w -X github.com/google/syzkaller/prog.GitRevision=1e62d1982527c3b4e18df04d61f2560fa1f434cc -X github.com/google/syzkaller/prog.gitRevisionDate=20260213-152336"  -o ./bin/linux_amd64/syz-execprog github.com/google/syzkaller/tools/syz-execprog
mkdir -p ./bin/linux_amd64
g++ -o ./bin/linux_amd64/syz-executor executor/executor.cc \
	-m64 -O2 -pthread -Wall -Werror -Wparentheses -Wunused-const-variable -Wframe-larger-than=16384 -Wno-stringop-overflow -Wno-array-bounds -Wno-format-overflow -Wno-unused-but-set-variable -Wno-unused-command-line-argument -static-pie -std=c++17 -I. -Iexecutor/_include   -DGOOS_linux=1 -DGOARCH_amd64=1 \
	-DHOSTGOOS_linux=1 -DGIT_REVISION=\"1e62d1982527c3b4e18df04d61f2560fa1f434cc\"
/usr/bin/ld: /tmp/ccAQ1hzd.o: in function `Connection::Connect(char const*, char const*)':
executor.cc:(.text._ZN10Connection7ConnectEPKcS1_[_ZN10Connection7ConnectEPKcS1_]+0x386): warning: Using 'gethostbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
./tools/check-syzos.sh 2>/dev/null


Error text is too large and was truncated, full error text is at:
https://syzkaller.appspot.com/x/error.txt?x=11c2155a580000


Tested on:

commit:         779cae95 Add linux-next specific files for 20260223
git tree:       linux-next
kernel config:  https://syzkaller.appspot.com/x/.config?x=ee920513e4deca5f
dashboard link: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1782455a580000


* [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
@ 2026-02-20 11:11 syzbot
  2026-02-23 13:40 ` Frederic Weisbecker
                   ` (8 more replies)
  0 siblings, 9 replies; 27+ messages in thread
From: syzbot @ 2026-02-20 11:11 UTC (permalink / raw)
  To: anna-maria, frederic, linux-kernel, syzkaller-bugs, tglx

Hello,

syzbot found the following issue on:

HEAD commit:    635c467cc14e Add linux-next specific files for 20260213
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1452f6e6580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=61690c38d1398936
dashboard link: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16e41c02580000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15813652580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/78b3d15ca8e6/disk-635c467c.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/a95f3d108ef4/vmlinux-635c467c.xz
kernel image: https://storage.googleapis.com/syzbot-assets/e58086838b24/bzImage-635c467c.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+7ea2f5e9dfd468201817@syzkaller.appspotmail.com

INFO: task syz.0.2812:14643 blocked for more than 143 seconds.
      Not tainted syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz.0.2812      state:D stack:25600 pid:14643 tgid:14643 ppid:13375  task_flags:0x400040 flags:0x00080002
Call Trace:
 <TASK>
 context_switch kernel/sched/core.c:5295 [inline]
 __schedule+0x1585/0x5340 kernel/sched/core.c:6907
 __schedule_loop kernel/sched/core.c:6989 [inline]
 schedule+0x164/0x360 kernel/sched/core.c:7004
 schedule_timeout+0xc3/0x2c0 kernel/time/sleep_timeout.c:75
 do_wait_for_common kernel/sched/completion.c:100 [inline]
 __wait_for_common kernel/sched/completion.c:121 [inline]
 wait_for_common kernel/sched/completion.c:132 [inline]
 wait_for_completion+0x2cc/0x5e0 kernel/sched/completion.c:153
 restrict_one_thread security/landlock/tsync.c:128 [inline]
 restrict_one_thread_callback+0x320/0x570 security/landlock/tsync.c:162
 task_work_run+0x1d9/0x270 kernel/task_work.c:233
 get_signal+0x11eb/0x1330 kernel/signal.c:2807
 arch_do_signal_or_restart+0xbc/0x830 arch/x86/kernel/signal.c:337
 __exit_to_user_mode_loop kernel/entry/common.c:64 [inline]
 exit_to_user_mode_loop+0x86/0x480 kernel/entry/common.c:98
 __exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
 syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline]
 syscall_exit_to_user_mode include/linux/entry-common.h:325 [inline]
 do_syscall_64+0x32d/0xf80 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f8d7f19bf79
RSP: 002b:00007ffe0b192a38 EFLAGS: 00000246 ORIG_RAX: 00000000000000db
RAX: fffffffffffffdfc RBX: 00000000000389f1 RCX: 00007f8d7f19bf79
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f8d7f41618c
RBP: 0000000000000032 R08: 3fffffffffffffff R09: 0000000000000000
R10: 00007ffe0b192b40 R11: 0000000000000246 R12: 00007ffe0b192b60
R13: 00007f8d7f41618c R14: 0000000000038a23 R15: 00007ffe0b192b40
 </TASK>
INFO: task syz.0.2812:14644 blocked for more than 143 seconds.
      Not tainted syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz.0.2812      state:D stack:28216 pid:14644 tgid:14643 ppid:13375  task_flags:0x400040 flags:0x00080002
Call Trace:
 <TASK>
 context_switch kernel/sched/core.c:5295 [inline]
 __schedule+0x1585/0x5340 kernel/sched/core.c:6907
 __schedule_loop kernel/sched/core.c:6989 [inline]
 schedule+0x164/0x360 kernel/sched/core.c:7004
 schedule_timeout+0xc3/0x2c0 kernel/time/sleep_timeout.c:75
 do_wait_for_common kernel/sched/completion.c:100 [inline]
 __wait_for_common kernel/sched/completion.c:121 [inline]
 wait_for_common kernel/sched/completion.c:132 [inline]
 wait_for_completion+0x2cc/0x5e0 kernel/sched/completion.c:153
 restrict_one_thread security/landlock/tsync.c:128 [inline]
 restrict_one_thread_callback+0x320/0x570 security/landlock/tsync.c:162
 task_work_run+0x1d9/0x270 kernel/task_work.c:233
 get_signal+0x11eb/0x1330 kernel/signal.c:2807
 arch_do_signal_or_restart+0xbc/0x830 arch/x86/kernel/signal.c:337
 __exit_to_user_mode_loop kernel/entry/common.c:64 [inline]
 exit_to_user_mode_loop+0x86/0x480 kernel/entry/common.c:98
 __exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
 syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline]
 syscall_exit_to_user_mode include/linux/entry-common.h:325 [inline]
 do_syscall_64+0x32d/0xf80 arch/x86/entry/syscall_64.c:100
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f8d7f19bf79
RSP: 002b:00007f8d8007c0e8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00007f8d7f415fa8 RCX: 00007f8d7f19bf79
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f8d7f415fa8
RBP: 00007f8d7f415fa0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f8d7f416038 R14: 00007ffe0b1927f0 R15: 00007ffe0b1928d8
 </TASK>
INFO: task syz.0.2812:14645 blocked for more than 143 seconds.
      Not tainted syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz.0.2812      state:D stack:28648 pid:14645 tgid:14643 ppid:13375  task_flags:0x400140 flags:0x00080006
Call Trace:
 <TASK>
 context_switch kernel/sched/core.c:5295 [inline]
 __schedule+0x1585/0x5340 kernel/sched/core.c:6907
 __schedule_loop kernel/sched/core.c:6989 [inline]
 schedule+0x164/0x360 kernel/sched/core.c:7004
 schedule_timeout+0xc3/0x2c0 kernel/time/sleep_timeout.c:75
 do_wait_for_common kernel/sched/completion.c:100 [inline]
 __wait_for_common kernel/sched/completion.c:121 [inline]
 wait_for_common kernel/sched/completion.c:132 [inline]
 wait_for_completion+0x2cc/0x5e0 kernel/sched/completion.c:153
 landlock_restrict_sibling_threads+0xe9c/0x11f0 security/landlock/tsync.c:539
 __do_sys_landlock_restrict_self security/landlock/syscalls.c:574 [inline]
 __se_sys_landlock_restrict_self+0x540/0x810 security/landlock/syscalls.c:482
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f8d7f19bf79
RSP: 002b:00007f8d8005b028 EFLAGS: 00000246 ORIG_RAX: 00000000000001be
RAX: ffffffffffffffda RBX: 00007f8d7f416090 RCX: 00007f8d7f19bf79
RDX: 0000000000000000 RSI: 000000000000000e RDI: 0000000000000003
RBP: 00007f8d7f2327e0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f8d7f416128 R14: 00007f8d7f416090 R15: 00007ffe0b1928d8
 </TASK>
INFO: task syz.0.2812:14646 blocked for more than 144 seconds.
      Not tainted syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz.0.2812      state:D stack:28832 pid:14646 tgid:14643 ppid:13375  task_flags:0x400140 flags:0x00080006
Call Trace:
 <TASK>
 context_switch kernel/sched/core.c:5295 [inline]
 __schedule+0x1585/0x5340 kernel/sched/core.c:6907
 __schedule_loop kernel/sched/core.c:6989 [inline]
 schedule+0x164/0x360 kernel/sched/core.c:7004
 schedule_timeout+0xc3/0x2c0 kernel/time/sleep_timeout.c:75
 do_wait_for_common kernel/sched/completion.c:100 [inline]
 __wait_for_common kernel/sched/completion.c:121 [inline]
 wait_for_common kernel/sched/completion.c:132 [inline]
 wait_for_completion+0x2cc/0x5e0 kernel/sched/completion.c:153
 landlock_restrict_sibling_threads+0xe9c/0x11f0 security/landlock/tsync.c:539
 __do_sys_landlock_restrict_self security/landlock/syscalls.c:574 [inline]
 __se_sys_landlock_restrict_self+0x540/0x810 security/landlock/syscalls.c:482
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f8d7f19bf79
RSP: 002b:00007f8d8003a028 EFLAGS: 00000246 ORIG_RAX: 00000000000001be
RAX: ffffffffffffffda RBX: 00007f8d7f416180 RCX: 00007f8d7f19bf79
RDX: 0000000000000000 RSI: 000000000000000e RDI: 0000000000000003
RBP: 00007f8d7f2327e0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f8d7f416218 R14: 00007f8d7f416180 R15: 00007ffe0b1928d8
 </TASK>

Showing all locks held in the system:
1 lock held by khungtaskd/31:
 #0: ffffffff8e9602e0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:312 [inline]
 #0: ffffffff8e9602e0 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:850 [inline]
 #0: ffffffff8e9602e0 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x2e/0x180 kernel/locking/lockdep.c:6775
2 locks held by getty/5581:
 #0: ffff8880328890a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:243
 #1: ffffc9000332b2f0 (&ldata->atomic_read_lock){+.+.}-{4:4}, at: n_tty_read+0x45c/0x13c0 drivers/tty/n_tty.c:2211

=============================================

NMI backtrace for cpu 0
CPU: 0 UID: 0 PID: 31 Comm: khungtaskd Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026
Call Trace:
 <TASK>
 dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
 nmi_cpu_backtrace+0x274/0x2d0 lib/nmi_backtrace.c:113
 nmi_trigger_cpumask_backtrace+0x17a/0x300 lib/nmi_backtrace.c:62
 trigger_all_cpu_backtrace include/linux/nmi.h:161 [inline]
 __sys_info lib/sys_info.c:157 [inline]
 sys_info+0x135/0x170 lib/sys_info.c:165
 check_hung_uninterruptible_tasks kernel/hung_task.c:346 [inline]
 watchdog+0xfd9/0x1030 kernel/hung_task.c:515
 kthread+0x388/0x470 kernel/kthread.c:467
 ret_from_fork+0x51e/0xb90 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
 </TASK>
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 UID: 0 PID: 86 Comm: kworker/u8:5 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026
Workqueue: events_unbound nsim_dev_trap_report_work
RIP: 0010:native_save_fl arch/x86/include/asm/irqflags.h:26 [inline]
RIP: 0010:arch_local_save_flags arch/x86/include/asm/irqflags.h:109 [inline]
RIP: 0010:arch_local_irq_save arch/x86/include/asm/irqflags.h:127 [inline]
RIP: 0010:lock_acquire+0xab/0x2e0 kernel/locking/lockdep.c:5864
Code: 84 c1 00 00 00 65 8b 05 73 b8 9f 11 85 c0 0f 85 b2 00 00 00 65 48 8b 05 bb 72 9f 11 83 b8 14 0b 00 00 00 0f 85 9d 00 00 00 9c <5b> fa 48 c7 c7 8f a1 02 8e e8 57 40 17 0a 65 ff 05 40 b8 9f 11 45
RSP: 0018:ffffc9000260f498 EFLAGS: 00000246
RAX: ffff88801df81e40 RBX: ffffffff818f9166 RCX: 0000000080000002
RDX: 0000000000000000 RSI: ffffffff8176da62 RDI: 1ffffffff1d2c05c
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: ffffc9000260f638 R11: ffffffff81b11580 R12: 0000000000000002
R13: ffffffff8e9602e0 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88812510b000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe09b2c1ff8 CR3: 000000000e74c000 CR4: 00000000003526f0
Call Trace:
 <TASK>
 rcu_lock_acquire include/linux/rcupdate.h:312 [inline]
 rcu_read_lock include/linux/rcupdate.h:850 [inline]
 class_rcu_constructor include/linux/rcupdate.h:1193 [inline]
 unwind_next_frame+0xc2/0x23c0 arch/x86/kernel/unwind_orc.c:495
 arch_stack_walk+0x11b/0x150 arch/x86/kernel/stacktrace.c:25
 stack_trace_save+0xa9/0x100 kernel/stacktrace.c:122
 kasan_save_stack mm/kasan/common.c:57 [inline]
 kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
 unpoison_slab_object mm/kasan/common.c:340 [inline]
 __kasan_slab_alloc+0x6c/0x80 mm/kasan/common.c:366
 kasan_slab_alloc include/linux/kasan.h:253 [inline]
 slab_post_alloc_hook mm/slub.c:4501 [inline]
 slab_alloc_node mm/slub.c:4830 [inline]
 kmem_cache_alloc_node_noprof+0x384/0x690 mm/slub.c:4882
 __alloc_skb+0x1d0/0x7d0 net/core/skbuff.c:702
 alloc_skb include/linux/skbuff.h:1383 [inline]
 nsim_dev_trap_skb_build drivers/net/netdevsim/dev.c:819 [inline]
 nsim_dev_trap_report drivers/net/netdevsim/dev.c:876 [inline]
 nsim_dev_trap_report_work+0x29a/0xb80 drivers/net/netdevsim/dev.c:922
 process_one_work+0x949/0x1650 kernel/workqueue.c:3279
 process_scheduled_works kernel/workqueue.c:3362 [inline]
 worker_thread+0xb46/0x1140 kernel/workqueue.c:3443
 kthread+0x388/0x470 kernel/kthread.c:467
 ret_from_fork+0x51e/0xb90 arch/x86/kernel/process.c:158
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
 </TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-20 11:11 syzbot
@ 2026-02-23 13:40 ` Frederic Weisbecker
  2026-02-23 15:15   ` Günther Noack
  2026-02-24  0:10 ` Hillf Danton
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Frederic Weisbecker @ 2026-02-23 13:40 UTC (permalink / raw)
  To: syzbot, Mickaël Salaün, Günther Noack, Paul Moore,
	James Morris, Serge E. Hallyn, linux-security-module
  Cc: anna-maria, linux-kernel, syzkaller-bugs, tglx

On Fri, Feb 20, 2026 at 03:11:21AM -0800, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    635c467cc14e Add linux-next specific files for 20260213
> git tree:       linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=1452f6e6580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=61690c38d1398936
> dashboard link: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
> compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16e41c02580000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15813652580000
> 
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/78b3d15ca8e6/disk-635c467c.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/a95f3d108ef4/vmlinux-635c467c.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/e58086838b24/bzImage-635c467c.xz
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+7ea2f5e9dfd468201817@syzkaller.appspotmail.com
> 
> INFO: task syz.0.2812:14643 blocked for more than 143 seconds.
>       Not tainted syzkaller #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:syz.0.2812      state:D stack:25600 pid:14643 tgid:14643 ppid:13375  task_flags:0x400040 flags:0x00080002
> Call Trace:
>  <TASK>
>  context_switch kernel/sched/core.c:5295 [inline]
>  __schedule+0x1585/0x5340 kernel/sched/core.c:6907
>  __schedule_loop kernel/sched/core.c:6989 [inline]
>  schedule+0x164/0x360 kernel/sched/core.c:7004
>  schedule_timeout+0xc3/0x2c0 kernel/time/sleep_timeout.c:75
>  do_wait_for_common kernel/sched/completion.c:100 [inline]
>  __wait_for_common kernel/sched/completion.c:121 [inline]
>  wait_for_common kernel/sched/completion.c:132 [inline]
>  wait_for_completion+0x2cc/0x5e0 kernel/sched/completion.c:153
>  restrict_one_thread security/landlock/tsync.c:128 [inline]
>  restrict_one_thread_callback+0x320/0x570 security/landlock/tsync.c:162

This seems to be related to the Landlock security module.
Cc'ing the maintainers for awareness.

Thanks.
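
As a cross-check of the traces quoted here, the ORIG_RAX field in each
register dump gives the system call number the hung thread entered. The
sketch below is a hypothetical helper, not part of syzbot or the kernel
tree. The names decode_orig_rax, summarize and REPORT_LINES are made up for
illustration, and the syscall table is a hand-copied x86-64 subset covering
only the numbers that appear in this report.

```python
import re

# Assumption: hand-copied subset of the x86-64 syscall table, limited to the
# numbers seen in the ORIG_RAX fields of this report.
SYSCALLS = {
    202: "futex",
    219: "restart_syscall",
    446: "landlock_restrict_self",
}

# ORIG_RAX is printed as a hexadecimal value at the end of the RSP line.
ORIG_RAX_RE = re.compile(r"ORIG_RAX:\s*([0-9a-fA-F]+)")


def decode_orig_rax(line):
    """Return (number, name) for a line containing ORIG_RAX, else None."""
    match = ORIG_RAX_RE.search(line)
    if match is None:
        return None
    number = int(match.group(1), 16)
    return number, SYSCALLS.get(number, "unknown")


# RSP lines copied from the four hung tasks in the report, keyed by pid.
REPORT_LINES = {
    14643: "RSP: 002b:00007ffe0b192a38 EFLAGS: 00000246 ORIG_RAX: 00000000000000db",
    14644: "RSP: 002b:00007f8d8007c0e8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca",
    14645: "RSP: 002b:00007f8d8005b028 EFLAGS: 00000246 ORIG_RAX: 00000000000001be",
    14646: "RSP: 002b:00007f8d8003a028 EFLAGS: 00000246 ORIG_RAX: 00000000000001be",
}


def summarize(lines_by_pid):
    """Map each pid to its decoded (number, name) pair."""
    return {pid: decode_orig_rax(line) for pid, line in lines_by_pid.items()}


if __name__ == "__main__":
    for pid, decoded in summarize(REPORT_LINES).items():
        print(pid, decoded)
```

Pids 14645 and 14646 decode to landlock_restrict_self, which matches the
landlock_restrict_sibling_threads frames in their call traces. Pids 14643
and 14644 were in other system calls and reached restrict_one_thread_callback
through task_work_run on the way back to user mode.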

>  task_work_run+0x1d9/0x270 kernel/task_work.c:233
>  get_signal+0x11eb/0x1330 kernel/signal.c:2807
>  arch_do_signal_or_restart+0xbc/0x830 arch/x86/kernel/signal.c:337
>  __exit_to_user_mode_loop kernel/entry/common.c:64 [inline]
>  exit_to_user_mode_loop+0x86/0x480 kernel/entry/common.c:98
>  __exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
>  syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline]
>  syscall_exit_to_user_mode include/linux/entry-common.h:325 [inline]
>  do_syscall_64+0x32d/0xf80 arch/x86/entry/syscall_64.c:100
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7f8d7f19bf79
> RSP: 002b:00007ffe0b192a38 EFLAGS: 00000246 ORIG_RAX: 00000000000000db
> RAX: fffffffffffffdfc RBX: 00000000000389f1 RCX: 00007f8d7f19bf79
> RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f8d7f41618c
> RBP: 0000000000000032 R08: 3fffffffffffffff R09: 0000000000000000
> R10: 00007ffe0b192b40 R11: 0000000000000246 R12: 00007ffe0b192b60
> R13: 00007f8d7f41618c R14: 0000000000038a23 R15: 00007ffe0b192b40
>  </TASK>
> INFO: task syz.0.2812:14644 blocked for more than 143 seconds.
>       Not tainted syzkaller #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:syz.0.2812      state:D stack:28216 pid:14644 tgid:14643 ppid:13375  task_flags:0x400040 flags:0x00080002
> Call Trace:
>  <TASK>
>  context_switch kernel/sched/core.c:5295 [inline]
>  __schedule+0x1585/0x5340 kernel/sched/core.c:6907
>  __schedule_loop kernel/sched/core.c:6989 [inline]
>  schedule+0x164/0x360 kernel/sched/core.c:7004
>  schedule_timeout+0xc3/0x2c0 kernel/time/sleep_timeout.c:75
>  do_wait_for_common kernel/sched/completion.c:100 [inline]
>  __wait_for_common kernel/sched/completion.c:121 [inline]
>  wait_for_common kernel/sched/completion.c:132 [inline]
>  wait_for_completion+0x2cc/0x5e0 kernel/sched/completion.c:153
>  restrict_one_thread security/landlock/tsync.c:128 [inline]
>  restrict_one_thread_callback+0x320/0x570 security/landlock/tsync.c:162
>  task_work_run+0x1d9/0x270 kernel/task_work.c:233
>  get_signal+0x11eb/0x1330 kernel/signal.c:2807
>  arch_do_signal_or_restart+0xbc/0x830 arch/x86/kernel/signal.c:337
>  __exit_to_user_mode_loop kernel/entry/common.c:64 [inline]
>  exit_to_user_mode_loop+0x86/0x480 kernel/entry/common.c:98
>  __exit_to_user_mode_prepare include/linux/irq-entry-common.h:226 [inline]
>  syscall_exit_to_user_mode_prepare include/linux/irq-entry-common.h:256 [inline]
>  syscall_exit_to_user_mode include/linux/entry-common.h:325 [inline]
>  do_syscall_64+0x32d/0xf80 arch/x86/entry/syscall_64.c:100
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7f8d7f19bf79
> RSP: 002b:00007f8d8007c0e8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
> RAX: fffffffffffffe00 RBX: 00007f8d7f415fa8 RCX: 00007f8d7f19bf79
> RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f8d7f415fa8
> RBP: 00007f8d7f415fa0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> R13: 00007f8d7f416038 R14: 00007ffe0b1927f0 R15: 00007ffe0b1928d8
>  </TASK>
> INFO: task syz.0.2812:14645 blocked for more than 143 seconds.
>       Not tainted syzkaller #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:syz.0.2812      state:D stack:28648 pid:14645 tgid:14643 ppid:13375  task_flags:0x400140 flags:0x00080006
> Call Trace:
>  <TASK>
>  context_switch kernel/sched/core.c:5295 [inline]
>  __schedule+0x1585/0x5340 kernel/sched/core.c:6907
>  __schedule_loop kernel/sched/core.c:6989 [inline]
>  schedule+0x164/0x360 kernel/sched/core.c:7004
>  schedule_timeout+0xc3/0x2c0 kernel/time/sleep_timeout.c:75
>  do_wait_for_common kernel/sched/completion.c:100 [inline]
>  __wait_for_common kernel/sched/completion.c:121 [inline]
>  wait_for_common kernel/sched/completion.c:132 [inline]
>  wait_for_completion+0x2cc/0x5e0 kernel/sched/completion.c:153
>  landlock_restrict_sibling_threads+0xe9c/0x11f0 security/landlock/tsync.c:539
>  __do_sys_landlock_restrict_self security/landlock/syscalls.c:574 [inline]
>  __se_sys_landlock_restrict_self+0x540/0x810 security/landlock/syscalls.c:482
>  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>  do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7f8d7f19bf79
> RSP: 002b:00007f8d8005b028 EFLAGS: 00000246 ORIG_RAX: 00000000000001be
> RAX: ffffffffffffffda RBX: 00007f8d7f416090 RCX: 00007f8d7f19bf79
> RDX: 0000000000000000 RSI: 000000000000000e RDI: 0000000000000003
> RBP: 00007f8d7f2327e0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> R13: 00007f8d7f416128 R14: 00007f8d7f416090 R15: 00007ffe0b1928d8
>  </TASK>
> INFO: task syz.0.2812:14646 blocked for more than 144 seconds.
>       Not tainted syzkaller #0
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> task:syz.0.2812      state:D stack:28832 pid:14646 tgid:14643 ppid:13375  task_flags:0x400140 flags:0x00080006
> Call Trace:
>  <TASK>
>  context_switch kernel/sched/core.c:5295 [inline]
>  __schedule+0x1585/0x5340 kernel/sched/core.c:6907
>  __schedule_loop kernel/sched/core.c:6989 [inline]
>  schedule+0x164/0x360 kernel/sched/core.c:7004
>  schedule_timeout+0xc3/0x2c0 kernel/time/sleep_timeout.c:75
>  do_wait_for_common kernel/sched/completion.c:100 [inline]
>  __wait_for_common kernel/sched/completion.c:121 [inline]
>  wait_for_common kernel/sched/completion.c:132 [inline]
>  wait_for_completion+0x2cc/0x5e0 kernel/sched/completion.c:153
>  landlock_restrict_sibling_threads+0xe9c/0x11f0 security/landlock/tsync.c:539
>  __do_sys_landlock_restrict_self security/landlock/syscalls.c:574 [inline]
>  __se_sys_landlock_restrict_self+0x540/0x810 security/landlock/syscalls.c:482
>  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>  do_syscall_64+0x14d/0xf80 arch/x86/entry/syscall_64.c:94
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> RIP: 0033:0x7f8d7f19bf79
> RSP: 002b:00007f8d8003a028 EFLAGS: 00000246 ORIG_RAX: 00000000000001be
> RAX: ffffffffffffffda RBX: 00007f8d7f416180 RCX: 00007f8d7f19bf79
> RDX: 0000000000000000 RSI: 000000000000000e RDI: 0000000000000003
> RBP: 00007f8d7f2327e0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> R13: 00007f8d7f416218 R14: 00007f8d7f416180 R15: 00007ffe0b1928d8
>  </TASK>
> 
> Showing all locks held in the system:
> 1 lock held by khungtaskd/31:
>  #0: ffffffff8e9602e0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:312 [inline]
>  #0: ffffffff8e9602e0 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:850 [inline]
>  #0: ffffffff8e9602e0 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x2e/0x180 kernel/locking/lockdep.c:6775
> 2 locks held by getty/5581:
>  #0: ffff8880328890a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:243
>  #1: ffffc9000332b2f0 (&ldata->atomic_read_lock){+.+.}-{4:4}, at: n_tty_read+0x45c/0x13c0 drivers/tty/n_tty.c:2211
> 
> =============================================
> 
> NMI backtrace for cpu 0
> CPU: 0 UID: 0 PID: 31 Comm: khungtaskd Not tainted syzkaller #0 PREEMPT(full) 
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026
> Call Trace:
>  <TASK>
>  dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120
>  nmi_cpu_backtrace+0x274/0x2d0 lib/nmi_backtrace.c:113
>  nmi_trigger_cpumask_backtrace+0x17a/0x300 lib/nmi_backtrace.c:62
>  trigger_all_cpu_backtrace include/linux/nmi.h:161 [inline]
>  __sys_info lib/sys_info.c:157 [inline]
>  sys_info+0x135/0x170 lib/sys_info.c:165
>  check_hung_uninterruptible_tasks kernel/hung_task.c:346 [inline]
>  watchdog+0xfd9/0x1030 kernel/hung_task.c:515
>  kthread+0x388/0x470 kernel/kthread.c:467
>  ret_from_fork+0x51e/0xb90 arch/x86/kernel/process.c:158
>  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
>  </TASK>
> Sending NMI from CPU 0 to CPUs 1:
> NMI backtrace for cpu 1
> CPU: 1 UID: 0 PID: 86 Comm: kworker/u8:5 Not tainted syzkaller #0 PREEMPT(full) 
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026
> Workqueue: events_unbound nsim_dev_trap_report_work
> RIP: 0010:native_save_fl arch/x86/include/asm/irqflags.h:26 [inline]
> RIP: 0010:arch_local_save_flags arch/x86/include/asm/irqflags.h:109 [inline]
> RIP: 0010:arch_local_irq_save arch/x86/include/asm/irqflags.h:127 [inline]
> RIP: 0010:lock_acquire+0xab/0x2e0 kernel/locking/lockdep.c:5864
> Code: 84 c1 00 00 00 65 8b 05 73 b8 9f 11 85 c0 0f 85 b2 00 00 00 65 48 8b 05 bb 72 9f 11 83 b8 14 0b 00 00 00 0f 85 9d 00 00 00 9c <5b> fa 48 c7 c7 8f a1 02 8e e8 57 40 17 0a 65 ff 05 40 b8 9f 11 45
> RSP: 0018:ffffc9000260f498 EFLAGS: 00000246
> RAX: ffff88801df81e40 RBX: ffffffff818f9166 RCX: 0000000080000002
> RDX: 0000000000000000 RSI: ffffffff8176da62 RDI: 1ffffffff1d2c05c
> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> R10: ffffc9000260f638 R11: ffffffff81b11580 R12: 0000000000000002
> R13: ffffffff8e9602e0 R14: 0000000000000000 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffff88812510b000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fe09b2c1ff8 CR3: 000000000e74c000 CR4: 00000000003526f0
> Call Trace:
>  <TASK>
>  rcu_lock_acquire include/linux/rcupdate.h:312 [inline]
>  rcu_read_lock include/linux/rcupdate.h:850 [inline]
>  class_rcu_constructor include/linux/rcupdate.h:1193 [inline]
>  unwind_next_frame+0xc2/0x23c0 arch/x86/kernel/unwind_orc.c:495
>  arch_stack_walk+0x11b/0x150 arch/x86/kernel/stacktrace.c:25
>  stack_trace_save+0xa9/0x100 kernel/stacktrace.c:122
>  kasan_save_stack mm/kasan/common.c:57 [inline]
>  kasan_save_track+0x3e/0x80 mm/kasan/common.c:78
>  unpoison_slab_object mm/kasan/common.c:340 [inline]
>  __kasan_slab_alloc+0x6c/0x80 mm/kasan/common.c:366
>  kasan_slab_alloc include/linux/kasan.h:253 [inline]
>  slab_post_alloc_hook mm/slub.c:4501 [inline]
>  slab_alloc_node mm/slub.c:4830 [inline]
>  kmem_cache_alloc_node_noprof+0x384/0x690 mm/slub.c:4882
>  __alloc_skb+0x1d0/0x7d0 net/core/skbuff.c:702
>  alloc_skb include/linux/skbuff.h:1383 [inline]
>  nsim_dev_trap_skb_build drivers/net/netdevsim/dev.c:819 [inline]
>  nsim_dev_trap_report drivers/net/netdevsim/dev.c:876 [inline]
>  nsim_dev_trap_report_work+0x29a/0xb80 drivers/net/netdevsim/dev.c:922
>  process_one_work+0x949/0x1650 kernel/workqueue.c:3279
>  process_scheduled_works kernel/workqueue.c:3362 [inline]
>  worker_thread+0xb46/0x1140 kernel/workqueue.c:3443
>  kthread+0x388/0x470 kernel/kthread.c:467
>  ret_from_fork+0x51e/0xb90 arch/x86/kernel/process.c:158
>  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
>  </TASK>
> 
> 
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
> 
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> 
> If the report is already addressed, let syzbot know by replying with:
> #syz fix: exact-commit-title
> 
> If you want syzbot to run the reproducer, reply with:
> #syz test: git://repo/address.git branch-or-commit-hash
> If you attach or paste a git patch, syzbot will apply it before testing.
> 
> If you want to overwrite report's subsystems, reply with:
> #syz set subsystems: new-subsystem
> (See the list of subsystem names on the web dashboard)
> 
> If the report is a duplicate of another one, reply with:
> #syz dup: exact-subject-of-another-report
> 
> If you want to undo deduplication, reply with:
> #syz undup

-- 
Frederic Weisbecker
SUSE Labs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-23 13:40 ` Frederic Weisbecker
@ 2026-02-23 15:15   ` Günther Noack
  0 siblings, 0 replies; 27+ messages in thread
From: Günther Noack @ 2026-02-23 15:15 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: syzbot, Mickaël Salaün, Paul Moore, James Morris,
	Serge E. Hallyn, linux-security-module, anna-maria, linux-kernel,
	syzkaller-bugs, tglx

On Mon, Feb 23, 2026 at 02:40:15PM +0100, Frederic Weisbecker wrote:
> On Fri, Feb 20, 2026 at 03:11:21AM -0800, syzbot wrote:
> > Call Trace:
> >  <TASK>
> >  context_switch kernel/sched/core.c:5295 [inline]
> >  __schedule+0x1585/0x5340 kernel/sched/core.c:6907
> >  __schedule_loop kernel/sched/core.c:6989 [inline]
> >  schedule+0x164/0x360 kernel/sched/core.c:7004
> >  schedule_timeout+0xc3/0x2c0 kernel/time/sleep_timeout.c:75
> >  do_wait_for_common kernel/sched/completion.c:100 [inline]
> >  __wait_for_common kernel/sched/completion.c:121 [inline]
> >  wait_for_common kernel/sched/completion.c:132 [inline]
> >  wait_for_completion+0x2cc/0x5e0 kernel/sched/completion.c:153
> >  restrict_one_thread security/landlock/tsync.c:128 [inline]
> >  restrict_one_thread_callback+0x320/0x570 security/landlock/tsync.c:162
> 
> Seems to be related to landlock security module.
> Cc'ing maintainers for awareness.

Thank you!  That is correct.  We are already discussing it in
https://lore.kernel.org/all/00A9E53EDC82309F+7b1dfc69-95f8-4ffc-a67c-967de0e2dfee@uniontech.com/

—Günther

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-20 11:11 syzbot
  2026-02-23 13:40 ` Frederic Weisbecker
@ 2026-02-24  0:10 ` Hillf Danton
  2026-02-24  3:05   ` syzbot
  2026-02-24 10:00   ` Günther Noack
  2026-02-25  5:10 ` Hillf Danton
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 27+ messages in thread
From: Hillf Danton @ 2026-02-24  0:10 UTC (permalink / raw)
  To: syzbot; +Cc: linux-kernel, syzkaller-bugs

On Fri, Feb 20, 2026 at 03:11:21AM -0800, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    635c467cc14e Add linux-next specific files for 20260213
> git tree:       linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=1452f6e6580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=61690c38d1398936
> dashboard link: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
> compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16e41c02580000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15813652580000

#syz test

--- x/security/landlock/tsync.c
+++ y/security/landlock/tsync.c
@@ -540,11 +540,8 @@ int landlock_restrict_sibling_threads(co
 		 * of for_each_thread().  We can reset it on each loop iteration because
 		 * all previous loop iterations are done with it already.
 		 *
-		 * num_preparing is initialized to 1 so that the counter can not go to 0
-		 * and mark the completion as done before all task works are registered.
-		 * We decrement it at the end of the loop body.
 		 */
-		atomic_set(&shared_ctx.num_preparing, 1);
+		atomic_set(&shared_ctx.num_preparing, 0);
 		reinit_completion(&shared_ctx.all_prepared);
 
 		/*
@@ -553,11 +550,7 @@ int landlock_restrict_sibling_threads(co
 		 */
 		found_more_threads = schedule_task_work(&works, &shared_ctx);
 
-		/*
-		 * Decrement num_preparing for current, to undo that we initialized it
-		 * to 1 a few lines above.
-		 */
-		if (atomic_dec_return(&shared_ctx.num_preparing) > 0) {
+		if (atomic_read(&shared_ctx.num_preparing) > 0) {
 			if (wait_for_completion_interruptible(
 				    &shared_ctx.all_prepared)) {
 				/* In case of interruption, we need to retry the system call. */
--
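
For readers following along: the comment removed in the hunk above
describes a "biased counter" pattern, where num_preparing starts at 1 so
that it cannot reach 0 while task works are still being registered.
Below is a minimal userspace sketch of that pattern, assuming C11 atomics
and pthreads in place of the kernel's atomic_t and struct completion.
The struct layout, worker() and run_round() are hypothetical names made
up for illustration; this is not the tsync.c code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define MAX_WORKERS 64

/* Hypothetical stand-in for the shared context; field names are illustrative. */
struct shared_ctx {
	atomic_int num_preparing;  /* outstanding workers, plus a bias of 1 */
	atomic_int num_prepared;   /* workers that have finished preparing */
	pthread_mutex_t lock;
	pthread_cond_t cv;
	int all_prepared;          /* userspace stand-in for a completion */
};

static void complete_all_prepared(struct shared_ctx *ctx)
{
	pthread_mutex_lock(&ctx->lock);
	ctx->all_prepared = 1;
	pthread_cond_broadcast(&ctx->cv);
	pthread_mutex_unlock(&ctx->lock);
}

static void *worker(void *arg)
{
	struct shared_ctx *ctx = arg;

	atomic_fetch_add(&ctx->num_prepared, 1);
	/* atomic_fetch_sub() returns the OLD value: 1 means we were the last. */
	if (atomic_fetch_sub(&ctx->num_preparing, 1) == 1)
		complete_all_prepared(ctx);
	return NULL;
}

/* Returns how many workers had prepared when the coordinator moved on. */
static int run_round(int nworkers)
{
	struct shared_ctx ctx;
	pthread_t threads[MAX_WORKERS];
	int i, started = 0, prepared;

	assert(nworkers >= 0 && nworkers <= MAX_WORKERS);
	atomic_init(&ctx.num_preparing, 1);  /* the bias held by the coordinator */
	atomic_init(&ctx.num_prepared, 0);
	pthread_mutex_init(&ctx.lock, NULL);
	pthread_cond_init(&ctx.cv, NULL);
	ctx.all_prepared = 0;

	for (i = 0; i < nworkers; i++) {
		/* Count the worker before it can run and decrement. */
		atomic_fetch_add(&ctx.num_preparing, 1);
		if (pthread_create(&threads[i], NULL, worker, &ctx) != 0) {
			/* Undo the registration; the bias keeps this above 0. */
			atomic_fetch_sub(&ctx.num_preparing, 1);
			break;
		}
		started++;
	}

	/* Drop the bias. An old value above 1 means workers are still pending. */
	if (atomic_fetch_sub(&ctx.num_preparing, 1) > 1) {
		pthread_mutex_lock(&ctx.lock);
		while (!ctx.all_prepared)
			pthread_cond_wait(&ctx.cv, &ctx.lock);
		pthread_mutex_unlock(&ctx.lock);
	}
	prepared = atomic_load(&ctx.num_prepared);

	for (i = 0; i < started; i++)
		pthread_join(threads[i], NULL);
	pthread_cond_destroy(&ctx.cv);
	pthread_mutex_destroy(&ctx.lock);
	return prepared;
}
```

Calling run_round(4) from a small main() (built with -pthread) should
return only after all started workers have prepared. Because of the bias,
a fast worker cannot drive the counter to 0 and signal completion while
later workers are still being registered. Note that the sketch uses
atomic_fetch_sub(), which returns the old value, whereas the kernel's
atomic_dec_return() returns the new one, so the comparisons are shifted
by one.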

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-24  0:10 ` Hillf Danton
@ 2026-02-24  3:05   ` syzbot
  2026-02-24 10:00   ` Günther Noack
  1 sibling, 0 replies; 27+ messages in thread
From: syzbot @ 2026-02-24  3:05 UTC (permalink / raw)
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot tried to test the proposed patch but the build/boot failed:

batadv0: Interface activated: batadv_slave_0
[   74.866770][ T5874] batman_adv: batadv0: Interface activated: batadv_slave_1
[   74.884996][   T58] netdevsim netdevsim0 netdevsim0: set [1, 0] type 2 family 0 port 6081 - 0
[   74.894310][   T58] netdevsim netdevsim0 netdevsim1: set [1, 0] type 2 family 0 port 6081 - 0
[   74.905283][   T58] netdevsim netdevsim0 netdevsim2: set [1, 0] type 2 family 0 port 6081 - 0
[   74.914746][   T58] netdevsim netdevsim0 netdevsim3: set [1, 0] type 2 family 0 port 6081 - 0
[   75.043478][   T58] netdevsim netdevsim0 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
[   75.127104][   T58] netdevsim netdevsim0 netdevsim2 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
[   75.185369][   T58] netdevsim netdevsim0 netdevsim1 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
[   75.334360][   T58] netdevsim netdevsim0 netdevsim0 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0
[   75.469719][   T49] wlan0: Created IBSS using preconfigured BSSID 50:50:50:50:50:50
[   75.490337][   T49] wlan0: Creating new IBSS network, BSSID 50:50:50:50:50:50
[   75.518222][   T49] wlan1: Created IBSS using preconfigured BSSID 50:50:50:50:50:50
[   75.526524][   T49] wlan1: Creating new IBSS network, BSSID 50:50:50:50:50:50
2026/02/24 03:03:58 executed programs: 0
[   76.930172][   T51] Bluetooth: hci0: unexpected cc 0x0c03 length: 249 > 1
[   76.938217][   T51] Bluetooth: hci0: unexpected cc 0x1003 length: 249 > 9
[   76.948739][   T51] Bluetooth: hci0: unexpected cc 0x1001 length: 249 > 9
[   76.957301][   T51] Bluetooth: hci0: unexpected cc 0x0c23 length: 249 > 4
[   76.965874][   T51] Bluetooth: hci0: unexpected cc 0x0c38 length: 249 > 2
[   77.092846][ T5933] chnl_net:caif_netlink_parms(): no params data found
[   77.162916][ T5933] bridge0: port 1(bridge_slave_0) entered blocking state
[   77.170555][ T5933] bridge0: port 1(bridge_slave_0) entered disabled state
[   77.178400][ T5933] bridge_slave_0: entered allmulticast mode
[   77.187301][ T5933] bridge_slave_0: entered promiscuous mode
[   77.195638][ T5933] bridge0: port 2(bridge_slave_1) entered blocking state
[   77.202901][ T5933] bridge0: port 2(bridge_slave_1) entered disabled state
[   77.210072][ T5933] bridge_slave_1: entered allmulticast mode
[   77.217641][ T5933] bridge_slave_1: entered promiscuous mode
[   77.246869][ T5933] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link
[   77.259177][ T5933] bond0: (slave bond_slave_1): Enslaving as an active interface with an up link
[   77.296961][ T5933] team0: Port device team_slave_0 added
[   77.305577][ T5933] team0: Port device team_slave_1 added
[   77.330912][ T5933] batman_adv: batadv0: Adding interface: batadv_slave_0
[   77.337915][ T5933] batman_adv: batadv0: The MTU of interface batadv_slave_0 is too small (1500) to handle the transport of batman-adv packets. Packets going over this interface will be fragmented on layer2 which could impact the performance. Setting the MTU to 1532 would solve the problem.
[   77.363986][ T5933] batman_adv: batadv0: Not using interface batadv_slave_0 (retrying later): interface not active
[   77.376893][ T5933] batman_adv: batadv0: Adding interface: batadv_slave_1
[   77.383931][ T5933] batman_adv: batadv0: The MTU of interface batadv_slave_1 is too small (1500) to handle the transport of batman-adv packets. Packets going over this interface will be fragmented on layer2 which could impact the performance. Setting the MTU to 1532 would solve the problem.
[   77.409943][ T5933] batman_adv: batadv0: Not using interface batadv_slave_1 (retrying later): interface not active
[   77.455317][ T5933] hsr_slave_0: entered promiscuous mode
[   77.463048][ T5933] hsr_slave_1: entered promiscuous mode
[   77.469984][ T5933] debugfs: 'hsr0' already exists in 'hsr'
[   77.475853][ T5933] Cannot create hsr debugfs directory
[   77.989478][   T58] bridge_slave_1: left allmulticast mode
[   77.997034][   T58] bridge_slave_1: left promiscuous mode
[   78.003996][   T58] bridge0: port 2(bridge_slave_1) entered disabled state
[   78.014607][   T58] bridge_slave_0: left allmulticast mode
[   78.020289][   T58] bridge_slave_0: left promiscuous mode
[   78.027158][   T58] bridge0: port 1(bridge_slave_0) entered disabled state
[   78.170497][   T58] bond0 (unregistering): (slave bond_slave_0): Releasing backup interface
[   78.181273][   T58] bond0 (unregistering): (slave bond_slave_1): Releasing backup interface
[   78.191450][   T58] bond0 (unregistering): Released all slaves
[   78.312349][   T58] hsr_slave_0: left promiscuous mode
[   78.318549][   T58] hsr_slave_1: left promiscuous mode
[   78.327463][   T58] batman_adv: batadv0: Interface deactivated: batadv_slave_0
[   78.335002][   T58] batman_adv: batadv0: Removing interface: batadv_slave_0
[   78.348743][   T58] batman_adv: batadv0: Interface deactivated: batadv_slave_1
[   78.357270][   T58] batman_adv: batadv0: Removing interface: batadv_slave_1
[   78.372232][   T58] veth1_macvtap: left promiscuous mode
[   78.377993][   T58] veth0_macvtap: left promiscuous mode
[   78.384521][   T58] veth1_vlan: left promiscuous mode
[   78.390232][   T58] veth0_vlan: left promiscuous mode
[   78.679788][   T58] team0 (unregistering): Port device team_slave_1 removed
[   78.694967][   T58] team0 (unregistering): Port device team_slave_0 removed
[   78.820049][    C0] list_del corruption, ffff88806e888490->next is NULL
[   78.827266][    C0] ------------[ cut here ]------------
[   78.832719][    C0] kernel BUG at lib/list_debug.c:53!
[   78.838048][    C0] Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
[   78.844303][    C0] CPU: 0 UID: 0 PID: 5487 Comm: dhcpcd Not tainted syzkaller #0 PREEMPT(full) 
[   78.853223][    C0] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2026
[   78.863256][    C0] RIP: 0010:__list_del_entry_valid_or_report+0xdf/0x190
[   78.870177][    C0] Code: 49 39 1f 0f 85 9e 00 00 00 b0 01 5b 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc cc 48 c7 c7 40 c1 29 8c 48 89 de e8 c2 29 65 fc 90 <0f> 0b 48 c7 c7 a0 c1 29 8c 48 89 de e8 b0 29 65 fc 90 0f 0b 4c 89
[   78.889849][    C0] RSP: 0018:ffffc90000007d58 EFLAGS: 00010046
[   78.895995][    C0] RAX: 0000000000000033 RBX: ffff88806e888490 RCX: f63d3b529a1a7600
[   78.903953][    C0] RDX: 0000000000000100 RSI: 0000000080000102 RDI: 0000000000000000
[   78.911915][    C0] RBP: 0000000000000203 R08: ffffc90000007ae7 R09: 1ffff92000000f5c
[   78.919872][    C0] R10: dffffc0000000000 R11: fffff52000000f5d R12: 1ffff1100dd11092
[   78.927827][    C0] R13: dffffc0000000000 R14: 0000000000000000 R15: 0000000000000000
[   78.935780][    C0] FS:  00007f45c61e0740(0000) GS:ffff888125009000(0000) knlGS:0000000000000000
[   78.944695][    C0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   78.951264][    C0] CR2: 0000561094e94138 CR3: 000000003472a000 CR4: 00000000003526f0
[   78.959224][    C0] Call Trace:
[   78.962488][    C0]  <IRQ>
[   78.965324][    C0]  dst_destroy+0x202/0x5a0
[   78.969728][    C0]  ? _raw_spin_unlock_irqrestore+0x30/0x80
[   78.975693][    C0]  ? rcu_core+0x751/0x1070
[   78.980104][    C0]  ? __pfx_dst_destroy_rcu+0x10/0x10
[   78.985381][    C0]  rcu_core+0x7cd/0x1070
[   78.989615][    C0]  ? __pfx_rcu_core+0x10/0x10
[   78.994277][    C0]  ? _raw_spin_unlock_irqrestore+0x4c/0x80
[   79.000075][    C0]  handle_softirqs+0x22a/0x870
[   79.004841][    C0]  ? do_softirq+0x76/0xd0
[   79.009165][    C0]  ? inet6_fill_ifla6_attrs+0x1150/0x25e0
[   79.014877][    C0]  do_softirq+0x76/0xd0
[   79.019025][    C0]  </IRQ>
[   79.021949][    C0]  <TASK>
[   79.024871][    C0]  __local_bh_enable_ip+0xf8/0x130
[   79.029987][    C0]  inet6_fill_ifla6_attrs+0x1150/0x25e0
[   79.035523][    C0]  ? __pfx_inet6_fill_ifla6_attrs+0x10/0x10
[   79.041400][    C0]  ? nla_put+0xd0/0x150
[   79.045548][    C0]  inet6_fill_link_af+0x9b/0x120
[   79.050473][    C0]  rtnl_fill_link_af+0x1c8/0x440
[   79.055405][    C0]  rtnl_fill_ifinfo+0x1e08/0x20f0
[   79.060421][    C0]  ? __pfx_rtnl_fill_ifinfo+0x10/0x10
[   79.065866][    C0]  ? __asan_memset+0x22/0x50
[   79.070449][    C0]  ? __nla_validate_parse+0x2480/0x2dc0
[   79.076021][    C0]  ? update_load_avg+0x1b0/0x1ec0
[   79.081042][    C0]  ? __lock_acquire+0x6b5/0x2cf0
[   79.085985][    C0]  ? xas_load+0x593/0x5b0
[   79.090308][    C0]  ? xa_find+0x25b/0x2b0
[   79.094537][    C0]  ? xa_find+0x8c/0x2b0
[   79.098679][    C0]  rtnl_dump_ifinfo+0xbb1/0x1180
[   79.103609][    C0]  ? __pfx_rtnl_dump_ifinfo+0x10/0x10
[   79.109062][    C0]  ? __lock_acquire+0x6b5/0x2cf0
[   79.114005][    C0]  ? trace_kmalloc+0x2a/0x110
[   79.118667][    C0]  ? __kmalloc_node_track_caller_noprof+0x4f9/0x7b0
[   79.125243][    C0]  ? __pfx_rtnl_dump_ifinfo+0x10/0x10
[   79.130598][    C0]  rtnl_dumpit+0xa2/0x200
[   79.134926][    C0]  netlink_dump+0x722/0xe80
[   79.139430][    C0]  ? __pfx_netlink_dump+0x10/0x10
[   79.144446][    C0]  ? __netlink_lookup+0x7e4/0x8b0
[   79.149544][    C0]  ? netlink_lookup+0x30/0x200
[   79.154297][    C0]  ? netlink_lookup+0x30/0x200
[   79.159049][    C0]  ? netlink_lookup+0x30/0x200
[   79.163803][    C0]  __netlink_dump_start+0x5cb/0x7e0
[   79.168994][    C0]  rtnetlink_rcv_msg+0xa3a/0xbe0
[   79.173914][    C0]  ? __pfx_rtnl_dump_ifinfo+0x10/0x10
[   79.179273][    C0]  ? rtnetlink_rcv_msg+0x1b9/0xbe0
[   79.184367][    C0]  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[   79.189808][    C0]  ? __pfx_rtnl_dumpit+0x10/0x10
[   79.194727][    C0]  ? __pfx_rtnl_dump_ifinfo+0x10/0x10
[   79.200098][    C0]  netlink_rcv_skb+0x232/0x4b0
[   79.204854][    C0]  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[   79.210320][    C0]  ? __pfx_netlink_rcv_skb+0x10/0x10
[   79.215638][    C0]  ? netlink_deliver_tap+0x2e/0x1b0
[   79.220896][    C0]  netlink_unicast+0x80f/0x9b0
[   79.225675][    C0]  ? __pfx_netlink_unicast+0x10/0x10
[   79.230970][    C0]  ? netlink_sendmsg+0x650/0xb40
[   79.235916][    C0]  ? skb_put+0x11b/0x210
[   79.240161][    C0]  netlink_sendmsg+0x813/0xb40
[   79.244919][    C0]  ? __pfx_netlink_sendmsg+0x10/0x10
[   79.250190][    C0]  ? tomoyo_socket_sendmsg_permission+0x1e0/0x300
[   79.256599][    C0]  ? __pfx_netlink_sendmsg+0x10/0x10
[   79.261870][    C0]  sock_sendmsg_nosec+0x18f/0x1d0
[   79.266890][    C0]  __sys_sendto+0x3ff/0x590
[   79.271378][    C0]  ? __pfx___sys_sendto+0x10/0x10
[   79.276392][    C0]  ? rcu_is_watching+0x15/0xb0
[   79.281149][    C0]  __x64_sys_sendto+0xde/0x100
[   79.285900][    C0]  do_syscall_64+0x14d/0xf80
[   79.290478][    C0]  ? trace_irq_disable+0x3b/0x150
[   79.295493][    C0]  ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   79.301544][    C0]  ? clear_bhb_loop+0x40/0x90
[   79.306206][    C0]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[   79.312116][    C0] RIP: 0033:0x7f45c626a407
[   79.316533][    C0] Code: 48 89 fa 4c 89 df e8 38 aa 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
[   79.336123][    C0] RSP: 002b:00007fff5ef7c930 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
[   79.344524][    C0] RAX: ffffffffffffffda RBX: 00007f45c61e0740 RCX: 00007f45c626a407
[   79.352489][    C0] RDX: 0000000000000014 RSI: 00007fff5ef7c9c0 RDI: 0000000000000016
[   79.360448][    C0] RBP: 00007fff5ef7c9a4 R08: 00007fff5ef7c9a4 R09: 000000000000000c
[   79.368403][    C0] R10: 0000000000000000 R11: 0000000000000202 R12: 00007fff5ef9d2b0
[   79.376357][    C0] R13: 00007f45c61e06c8 R14: 00007fff5ef7caa0 R15: 00007fff5ef8d080
[   79.384327][    C0]  </TASK>
[   79.387327][    C0] Modules linked in:
[   79.391216][    C0] ---[ end trace 0000000000000000 ]---
[   79.396657][    C0] RIP: 0010:__list_del_entry_valid_or_report+0xdf/0x190
[   79.403589][    C0] Code: 49 39 1f 0f 85 9e 00 00 00 b0 01 5b 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc cc 48 c7 c7 40 c1 29 8c 48 89 de e8 c2 29 65 fc 90 <0f> 0b 48 c7 c7 a0 c1 29 8c 48 89 de e8 b0 29 65 fc 90 0f 0b 4c 89
[   79.423200][    C0] RSP: 0018:ffffc90000007d58 EFLAGS: 00010046
[   79.429261][    C0] RAX: 0000000000000033 RBX: ffff88806e888490 RCX: f63d3b529a1a7600
[   79.437219][    C0] RDX: 0000000000000100 RSI: 0000000080000102 RDI: 0000000000000000
[   79.445172][    C0] RBP: 0000000000000203 R08: ffffc90000007ae7 R09: 1ffff92000000f5c
[   79.453132][    C0] R10: dffffc0000000000 R11: fffff52000000f5d R12: 1ffff1100dd11092
[   79.461084][    C0] R13: dffffc0000000000 R14: 0000000000000000 R15: 0000000000000000
[   79.469035][    C0] FS:  00007f45c61e0740(0000) GS:ffff888125009000(0000) knlGS:0000000000000000
[   79.477943][    C0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   79.484508][    C0] CR2: 0000561094e94138 CR3: 000000003472a000 CR4: 00000000003526f0
[   79.492468][    C0] Kernel panic - not syncing: Fatal exception in interrupt
[   79.500018][    C0] Kernel Offset: disabled
[   79.504325][    C0] Rebooting in 86400 seconds..


syzkaller build log:
go env (err=<nil>)
AR='ar'
CC='gcc'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='g++'
GCCGO='gccgo'
GO111MODULE='auto'
GOAMD64='v1'
GOARCH='amd64'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/syzkaller/.cache/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/syzkaller/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build1172046918=/tmp/go-build -gno-record-gcc-switches'
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMOD='/syzkaller/jobs/linux/gopath/src/github.com/google/syzkaller/go.mod'
GOMODCACHE='/syzkaller/jobs/linux/gopath/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/syzkaller/jobs/linux/gopath'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTELEMETRY='local'
GOTELEMETRYDIR='/syzkaller/.config/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.24.4'
GOWORK=''
PKG_CONFIG='pkg-config'

git status (err=<nil>)
HEAD detached at 1e62d198252
nothing to commit, working tree clean


tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
go list -f '{{.Stale}}' -ldflags="-s -w -X github.com/google/syzkaller/prog.GitRevision=1e62d1982527c3b4e18df04d61f2560fa1f434cc -X github.com/google/syzkaller/prog.gitRevisionDate=20260213-152336"  ./sys/syz-sysgen | grep -q false || go install -ldflags="-s -w -X github.com/google/syzkaller/prog.GitRevision=1e62d1982527c3b4e18df04d61f2560fa1f434cc -X github.com/google/syzkaller/prog.gitRevisionDate=20260213-152336"  ./sys/syz-sysgen
make .descriptions
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
bin/syz-sysgen
touch .descriptions
GOOS=linux GOARCH=amd64 go build -ldflags="-s -w -X github.com/google/syzkaller/prog.GitRevision=1e62d1982527c3b4e18df04d61f2560fa1f434cc -X github.com/google/syzkaller/prog.gitRevisionDate=20260213-152336"  -o ./bin/linux_amd64/syz-execprog github.com/google/syzkaller/tools/syz-execprog
mkdir -p ./bin/linux_amd64
g++ -o ./bin/linux_amd64/syz-executor executor/executor.cc \
	-m64 -O2 -pthread -Wall -Werror -Wparentheses -Wunused-const-variable -Wframe-larger-than=16384 -Wno-stringop-overflow -Wno-array-bounds -Wno-format-overflow -Wno-unused-but-set-variable -Wno-unused-command-line-argument -static-pie -std=c++17 -I. -Iexecutor/_include   -DGOOS_linux=1 -DGOARCH_amd64=1 \
	-DHOSTGOOS_linux=1 -DGIT_REVISION=\"1e62d1982527c3b4e18df04d61f2560fa1f434cc\"
/usr/bin/ld: /tmp/ccL3cRx2.o: in function `Connection::Connect(char const*, char const*)':
executor.cc:(.text._ZN10Connection7ConnectEPKcS1_[_ZN10Connection7ConnectEPKcS1_]+0x386): warning: Using 'gethostbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
./tools/check-syzos.sh 2>/dev/null


Error text is too large and was truncated, full error text is at:
https://syzkaller.appspot.com/x/error.txt?x=11bac006580000


Tested on:

commit:         779cae95 Add linux-next specific files for 20260223
git tree:       linux-next
kernel config:  https://syzkaller.appspot.com/x/.config?x=ee920513e4deca5f
dashboard link: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
patch:          https://syzkaller.appspot.com/x/patch.diff?x=15d8e55a580000


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-24  0:10 ` Hillf Danton
  2026-02-24  3:05   ` syzbot
@ 2026-02-24 10:00   ` Günther Noack
  1 sibling, 0 replies; 27+ messages in thread
From: Günther Noack @ 2026-02-24 10:00 UTC (permalink / raw)
  To: Hillf Danton; +Cc: syzbot, linux-kernel, syzkaller-bugs

On Tue, Feb 24, 2026 at 08:10:30AM +0800, Hillf Danton wrote:
> On Fri, Feb 20, 2026 at 03:11:21AM -0800, syzbot wrote:
> > Hello,
> > 
> > syzbot found the following issue on:
> > 
> > HEAD commit:    635c467cc14e Add linux-next specific files for 20260213
> > git tree:       linux-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1452f6e6580000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=61690c38d1398936
> > dashboard link: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
> > compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16e41c02580000
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15813652580000
> 
> #syz test
> 
> --- x/security/landlock/tsync.c
> +++ y/security/landlock/tsync.c
> @@ -540,11 +540,8 @@ int landlock_restrict_sibling_threads(co
>  		 * of for_each_thread().  We can reset it on each loop iteration because
>  		 * all previous loop iterations are done with it already.
>  		 *
> -		 * num_preparing is initialized to 1 so that the counter can not go to 0
> -		 * and mark the completion as done before all task works are registered.
> -		 * We decrement it at the end of the loop body.
>  		 */
> -		atomic_set(&shared_ctx.num_preparing, 1);
> +		atomic_set(&shared_ctx.num_preparing, 0);
>  		reinit_completion(&shared_ctx.all_prepared);
>  
>  		/*
> @@ -553,11 +550,7 @@ int landlock_restrict_sibling_threads(co
>  		 */
>  		found_more_threads = schedule_task_work(&works, &shared_ctx);
>  
> -		/*
> -		 * Decrement num_preparing for current, to undo that we initialized it
> -		 * to 1 a few lines above.
> -		 */
> -		if (atomic_dec_return(&shared_ctx.num_preparing) > 0) {
> +		if (atomic_read(&shared_ctx.num_preparing) > 0) {
>  			if (wait_for_completion_interruptible(
>  				    &shared_ctx.all_prepared)) {
>  				/* In case of interruption, we need to retry the system call. */
> --

Hello Hillf!

Thanks for your contribution.

We have already analyzed the bug in an adjacent mail thread and have a
tentative patch that we know fixes the issue:

https://lore.kernel.org/all/20260224062729.2908692-1-dingyihan@uniontech.com/

–Günther

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-20 11:11 syzbot
  2026-02-23 13:40 ` Frederic Weisbecker
  2026-02-24  0:10 ` Hillf Danton
@ 2026-02-25  5:10 ` Hillf Danton
  2026-02-25 10:22 ` Hillf Danton
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Hillf Danton @ 2026-02-25  5:10 UTC (permalink / raw)
  To: syzbot; +Cc: linux-kernel, syzkaller-bugs

On Fri, Feb 20, 2026 at 03:11:21AM -0800, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    635c467cc14e Add linux-next specific files for 20260213
> git tree:       linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=1452f6e6580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=61690c38d1398936
> dashboard link: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
> compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16e41c02580000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15813652580000

#syz test

--- x/security/landlock/tsync.c
+++ y/security/landlock/tsync.c
@@ -540,11 +540,8 @@ int landlock_restrict_sibling_threads(co
 		 * of for_each_thread().  We can reset it on each loop iteration because
 		 * all previous loop iterations are done with it already.
 		 *
-		 * num_preparing is initialized to 1 so that the counter can not go to 0
-		 * and mark the completion as done before all task works are registered.
-		 * We decrement it at the end of the loop body.
 		 */
-		atomic_set(&shared_ctx.num_preparing, 1);
+		atomic_set(&shared_ctx.num_preparing, 0);
 		reinit_completion(&shared_ctx.all_prepared);
 
 		/*
@@ -553,11 +550,7 @@ int landlock_restrict_sibling_threads(co
 		 */
 		found_more_threads = schedule_task_work(&works, &shared_ctx);
 
-		/*
-		 * Decrement num_preparing for current, to undo that we initialized it
-		 * to 1 a few lines above.
-		 */
-		if (atomic_dec_return(&shared_ctx.num_preparing) > 0) {
+		if (atomic_read(&shared_ctx.num_preparing) > 0) {
 			if (wait_for_completion_interruptible(
 				    &shared_ctx.all_prepared)) {
 				/* In case of interruption, we need to retry the system call. */
--
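
For readers following the patch above: the comment it removes describes a
"biased counter" idiom, where num_preparing starts at 1 so that it cannot
reach 0 (and fire the completion) while task works are still being
registered. Below is a minimal userspace sketch of that idiom, assuming
pthreads and C11 atomics as stand-ins for the kernel's task_work, atomic_t
and struct completion. The names run_biased(), worker() and the completion_*
helpers are hypothetical and do not appear in tsync.c; this is not the
kernel implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_WORKERS 64

/* Hypothetical userspace stand-in for the kernel's struct completion. */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

struct shared_ctx {
	atomic_int num_preparing;
	struct completion all_prepared;
};

static void completion_init(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

static void completion_complete_all(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_broadcast(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

static void completion_wait(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Each worker reports that it is prepared; the one that hits 0 signals. */
static void *worker(void *arg)
{
	struct shared_ctx *ctx = arg;

	/* atomic_fetch_sub() returns the old value: 1 means we reached 0. */
	if (atomic_fetch_sub(&ctx->num_preparing, 1) == 1)
		completion_complete_all(&ctx->all_prepared);
	return NULL;
}

/* Returns the final counter value (0 when balanced), or -1 on bad input. */
static int run_biased(int nworkers)
{
	struct shared_ctx ctx;
	pthread_t threads[MAX_WORKERS];
	int i, started;

	if (nworkers < 0 || nworkers > MAX_WORKERS)
		return -1;

	/* Bias of 1 for the coordinator: workers alone cannot reach 0. */
	atomic_init(&ctx.num_preparing, 1);
	completion_init(&ctx.all_prepared);

	for (i = 0; i < nworkers; i++) {
		atomic_fetch_add(&ctx.num_preparing, 1);
		if (pthread_create(&threads[i], NULL, worker, &ctx)) {
			/* Registration failed: undo the increment and stop. */
			atomic_fetch_sub(&ctx.num_preparing, 1);
			break;
		}
	}
	started = i;

	/* Drop the bias; wait only if some worker is still preparing. */
	if (atomic_fetch_sub(&ctx.num_preparing, 1) > 1)
		completion_wait(&ctx.all_prepared);

	for (i = 0; i < started; i++)
		pthread_join(threads[i], NULL);

	return atomic_load(&ctx.num_preparing);
}
```

Calling run_biased(4) from a main() and building with "cc -pthread" should
return 0. In this sketch, starting the counter at 0 instead would let an
early worker decrement it below zero, or complete before the remaining
workers are registered; the bias exists to rule that out.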

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-20 11:11 syzbot
                   ` (2 preceding siblings ...)
  2026-02-25  5:10 ` Hillf Danton
@ 2026-02-25 10:22 ` Hillf Danton
  2026-02-25 12:21 ` Hillf Danton
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Hillf Danton @ 2026-02-25 10:22 UTC (permalink / raw)
  To: syzbot; +Cc: linux-kernel, syzkaller-bugs

On Fri, Feb 20, 2026 at 03:11:21AM -0800, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    635c467cc14e Add linux-next specific files for 20260213
> git tree:       linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=1452f6e6580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=61690c38d1398936
> dashboard link: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
> compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16e41c02580000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15813652580000

#syz test  upstream master

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-20 11:11 syzbot
                   ` (3 preceding siblings ...)
  2026-02-25 10:22 ` Hillf Danton
@ 2026-02-25 12:21 ` Hillf Danton
  2026-02-25 22:32 ` Hillf Danton
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Hillf Danton @ 2026-02-25 12:21 UTC (permalink / raw)
  To: syzbot; +Cc: linux-kernel, syzkaller-bugs

On Fri, Feb 20, 2026 at 03:11:21AM -0800, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    635c467cc14e Add linux-next specific files for 20260213
> git tree:       linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=1452f6e6580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=61690c38d1398936
> dashboard link: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
> compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16e41c02580000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15813652580000

#syz test  upstream master

--- x/security/landlock/tsync.c
+++ y/security/landlock/tsync.c
@@ -502,11 +502,8 @@ int landlock_restrict_sibling_threads(co
 		 * of for_each_thread().  We can reset it on each loop iteration because
 		 * all previous loop iterations are done with it already.
 		 *
-		 * num_preparing is initialized to 1 so that the counter can not go to 0
-		 * and mark the completion as done before all task works are registered.
-		 * We decrement it at the end of the loop body.
 		 */
-		atomic_set(&shared_ctx.num_preparing, 1);
+		atomic_set(&shared_ctx.num_preparing, 0);
 		reinit_completion(&shared_ctx.all_prepared);
 
 		/*
@@ -515,11 +512,7 @@ int landlock_restrict_sibling_threads(co
 		 */
 		found_more_threads = schedule_task_work(&works, &shared_ctx);
 
-		/*
-		 * Decrement num_preparing for current, to undo that we initialized it
-		 * to 1 a few lines above.
-		 */
-		if (atomic_dec_return(&shared_ctx.num_preparing) > 0) {
+		if (atomic_read(&shared_ctx.num_preparing) > 0) {
 			if (wait_for_completion_interruptible(
 				    &shared_ctx.all_prepared)) {
 				/* In case of interruption, we need to retry the system call. */
--

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-20 11:11 syzbot
                   ` (4 preceding siblings ...)
  2026-02-25 12:21 ` Hillf Danton
@ 2026-02-25 22:32 ` Hillf Danton
  2026-02-26  2:19 ` Hillf Danton
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 27+ messages in thread
From: Hillf Danton @ 2026-02-25 22:32 UTC (permalink / raw)
  To: syzbot; +Cc: linux-kernel, syzkaller-bugs

On Fri, Feb 20, 2026 at 03:11:21AM -0800, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    635c467cc14e Add linux-next specific files for 20260213
> git tree:       linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=1452f6e6580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=61690c38d1398936
> dashboard link: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
> compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16e41c02580000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15813652580000

#syz test  upstream master

--- x/security/landlock/tsync.c
+++ y/security/landlock/tsync.c
@@ -441,7 +441,7 @@ int landlock_restrict_sibling_threads(co
 	atomic_set(&shared_ctx.preparation_error, 0);
 	init_completion(&shared_ctx.all_prepared);
 	init_completion(&shared_ctx.ready_to_commit);
-	atomic_set(&shared_ctx.num_unfinished, 1);
+	atomic_set(&shared_ctx.num_unfinished, 0);
 	init_completion(&shared_ctx.all_finished);
 	shared_ctx.old_cred = old_cred;
 	shared_ctx.new_cred = new_cred;
@@ -502,11 +502,8 @@ int landlock_restrict_sibling_threads(co
 		 * of for_each_thread().  We can reset it on each loop iteration because
 		 * all previous loop iterations are done with it already.
 		 *
-		 * num_preparing is initialized to 1 so that the counter can not go to 0
-		 * and mark the completion as done before all task works are registered.
-		 * We decrement it at the end of the loop body.
 		 */
-		atomic_set(&shared_ctx.num_preparing, 1);
+		atomic_set(&shared_ctx.num_preparing, 0);
 		reinit_completion(&shared_ctx.all_prepared);
 
 		/*
@@ -515,11 +512,7 @@ int landlock_restrict_sibling_threads(co
 		 */
 		found_more_threads = schedule_task_work(&works, &shared_ctx);
 
-		/*
-		 * Decrement num_preparing for current, to undo that we initialized it
-		 * to 1 a few lines above.
-		 */
-		if (atomic_dec_return(&shared_ctx.num_preparing) > 0) {
+		if (atomic_read(&shared_ctx.num_preparing) > 0) {
 			if (wait_for_completion_interruptible(
 				    &shared_ctx.all_prepared)) {
 				/* In case of interruption, we need to retry the system call. */
@@ -548,11 +541,7 @@ int landlock_restrict_sibling_threads(co
 	 */
 	complete_all(&shared_ctx.ready_to_commit);
 
-	/*
-	 * Decrement num_unfinished for current, to undo that we initialized it to 1
-	 * at the beginning.
-	 */
-	if (atomic_dec_return(&shared_ctx.num_unfinished) > 0)
+	if (atomic_read(&shared_ctx.num_unfinished) > 0)
 		wait_for_completion(&shared_ctx.all_finished);
 
 	tsync_works_release(&works);
--

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-20 11:11 syzbot
                   ` (5 preceding siblings ...)
  2026-02-25 22:32 ` Hillf Danton
@ 2026-02-26  2:19 ` Hillf Danton
  2026-02-26 10:04 ` Hillf Danton
  2026-02-27  0:03 ` Hillf Danton
  8 siblings, 0 replies; 27+ messages in thread
From: Hillf Danton @ 2026-02-26  2:19 UTC (permalink / raw)
  To: syzbot; +Cc: linux-kernel, syzkaller-bugs

On Fri, Feb 20, 2026 at 03:11:21AM -0800, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    635c467cc14e Add linux-next specific files for 20260213
> git tree:       linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=1452f6e6580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=61690c38d1398936
> dashboard link: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
> compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16e41c02580000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15813652580000

#syz test  upstream master

--- x/security/landlock/tsync.c
+++ y/security/landlock/tsync.c
@@ -391,7 +391,8 @@ static bool schedule_task_work(struct ts
 			ctx->task = NULL;
 
 			atomic_dec(&shared_ctx->num_preparing);
-			atomic_dec(&shared_ctx->num_unfinished);
+			if (atomic_dec_return(&shared_ctx->num_unfinished) == 0)
+				complete_all(&shared_ctx->all_finished);
 		}
 	}
 
@@ -441,7 +442,7 @@ int landlock_restrict_sibling_threads(co
 	atomic_set(&shared_ctx.preparation_error, 0);
 	init_completion(&shared_ctx.all_prepared);
 	init_completion(&shared_ctx.ready_to_commit);
-	atomic_set(&shared_ctx.num_unfinished, 1);
+	atomic_set(&shared_ctx.num_unfinished, 0);
 	init_completion(&shared_ctx.all_finished);
 	shared_ctx.old_cred = old_cred;
 	shared_ctx.new_cred = new_cred;
@@ -502,11 +503,8 @@ int landlock_restrict_sibling_threads(co
 		 * of for_each_thread().  We can reset it on each loop iteration because
 		 * all previous loop iterations are done with it already.
 		 *
-		 * num_preparing is initialized to 1 so that the counter can not go to 0
-		 * and mark the completion as done before all task works are registered.
-		 * We decrement it at the end of the loop body.
 		 */
-		atomic_set(&shared_ctx.num_preparing, 1);
+		atomic_set(&shared_ctx.num_preparing, 0);
 		reinit_completion(&shared_ctx.all_prepared);
 
 		/*
@@ -515,11 +513,7 @@ int landlock_restrict_sibling_threads(co
 		 */
 		found_more_threads = schedule_task_work(&works, &shared_ctx);
 
-		/*
-		 * Decrement num_preparing for current, to undo that we initialized it
-		 * to 1 a few lines above.
-		 */
-		if (atomic_dec_return(&shared_ctx.num_preparing) > 0) {
+		if (atomic_read(&shared_ctx.num_preparing) > 0) {
 			if (wait_for_completion_interruptible(
 				    &shared_ctx.all_prepared)) {
 				/* In case of interruption, we need to retry the system call. */
@@ -548,11 +542,7 @@ int landlock_restrict_sibling_threads(co
 	 */
 	complete_all(&shared_ctx.ready_to_commit);
 
-	/*
-	 * Decrement num_unfinished for current, to undo that we initialized it to 1
-	 * at the beginning.
-	 */
-	if (atomic_dec_return(&shared_ctx.num_unfinished) > 0)
+	if (atomic_read(&shared_ctx.num_unfinished) > 0)
 		wait_for_completion(&shared_ctx.all_finished);
 
 	tsync_works_release(&works);
--

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-20 11:11 syzbot
                   ` (6 preceding siblings ...)
  2026-02-26  2:19 ` Hillf Danton
@ 2026-02-26 10:04 ` Hillf Danton
  2026-02-27  0:03 ` Hillf Danton
  8 siblings, 0 replies; 27+ messages in thread
From: Hillf Danton @ 2026-02-26 10:04 UTC (permalink / raw)
  To: syzbot; +Cc: linux-kernel, syzkaller-bugs

On Fri, Feb 20, 2026 at 03:11:21AM -0800, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    635c467cc14e Add linux-next specific files for 20260213
> git tree:       linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=1452f6e6580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=61690c38d1398936
> dashboard link: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
> compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16e41c02580000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15813652580000

#syz test  upstream master

--- x/security/landlock/tsync.c
+++ y/security/landlock/tsync.c
@@ -391,7 +391,8 @@ static bool schedule_task_work(struct ts
 			ctx->task = NULL;
 
 			atomic_dec(&shared_ctx->num_preparing);
-			atomic_dec(&shared_ctx->num_unfinished);
+			if (atomic_dec_return(&shared_ctx->num_unfinished) == 0)
+				complete_all(&shared_ctx->all_finished);
 		}
 	}
 
@@ -432,16 +433,21 @@ static void cancel_tsync_works(struct ts
 int landlock_restrict_sibling_threads(const struct cred *old_cred,
 				      const struct cred *new_cred)
 {
+	static int concur = 0;
 	int err;
 	struct tsync_shared_context shared_ctx;
 	struct tsync_works works = {};
 	size_t newly_discovered_threads;
 	bool found_more_threads;
 
+	if (concur++) {
+		concur--;
+		return -EBUSY;
+	}
 	atomic_set(&shared_ctx.preparation_error, 0);
 	init_completion(&shared_ctx.all_prepared);
 	init_completion(&shared_ctx.ready_to_commit);
-	atomic_set(&shared_ctx.num_unfinished, 1);
+	atomic_set(&shared_ctx.num_unfinished, 0);
 	init_completion(&shared_ctx.all_finished);
 	shared_ctx.old_cred = old_cred;
 	shared_ctx.new_cred = new_cred;
@@ -502,11 +508,8 @@ int landlock_restrict_sibling_threads(co
 		 * of for_each_thread().  We can reset it on each loop iteration because
 		 * all previous loop iterations are done with it already.
 		 *
-		 * num_preparing is initialized to 1 so that the counter can not go to 0
-		 * and mark the completion as done before all task works are registered.
-		 * We decrement it at the end of the loop body.
 		 */
-		atomic_set(&shared_ctx.num_preparing, 1);
+		atomic_set(&shared_ctx.num_preparing, 0);
 		reinit_completion(&shared_ctx.all_prepared);
 
 		/*
@@ -515,11 +518,7 @@ int landlock_restrict_sibling_threads(co
 		 */
 		found_more_threads = schedule_task_work(&works, &shared_ctx);
 
-		/*
-		 * Decrement num_preparing for current, to undo that we initialized it
-		 * to 1 a few lines above.
-		 */
-		if (atomic_dec_return(&shared_ctx.num_preparing) > 0) {
+		if (atomic_read(&shared_ctx.num_preparing) > 0) {
 			if (wait_for_completion_interruptible(
 				    &shared_ctx.all_prepared)) {
 				/* In case of interruption, we need to retry the system call. */
@@ -548,14 +547,11 @@ int landlock_restrict_sibling_threads(co
 	 */
 	complete_all(&shared_ctx.ready_to_commit);
 
-	/*
-	 * Decrement num_unfinished for current, to undo that we initialized it to 1
-	 * at the beginning.
-	 */
-	if (atomic_dec_return(&shared_ctx.num_unfinished) > 0)
+	if (atomic_read(&shared_ctx.num_unfinished) > 0)
 		wait_for_completion(&shared_ctx.all_finished);
 
 	tsync_works_release(&works);
 
+	concur--;
 	return atomic_read(&shared_ctx.preparation_error);
 }
--

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback
  2026-02-20 11:11 syzbot
                   ` (7 preceding siblings ...)
  2026-02-26 10:04 ` Hillf Danton
@ 2026-02-27  0:03 ` Hillf Danton
  8 siblings, 0 replies; 27+ messages in thread
From: Hillf Danton @ 2026-02-27  0:03 UTC (permalink / raw)
  To: syzbot; +Cc: linux-kernel, syzkaller-bugs

On Fri, Feb 20, 2026 at 03:11:21AM -0800, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    635c467cc14e Add linux-next specific files for 20260213
> git tree:       linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=1452f6e6580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=61690c38d1398936
> dashboard link: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
> compiler:       Debian clang version 21.1.8 (++20251221033036+2078da43e25a-1~exp1~20251221153213.50), Debian LLD 21.1.8
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16e41c02580000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15813652580000

#syz test  upstream master

--- x/security/landlock/tsync.c
+++ y/security/landlock/tsync.c
@@ -432,12 +432,17 @@ static void cancel_tsync_works(struct ts
 int landlock_restrict_sibling_threads(const struct cred *old_cred,
 				      const struct cred *new_cred)
 {
+	static int concur = 0;
 	int err;
 	struct tsync_shared_context shared_ctx;
 	struct tsync_works works = {};
 	size_t newly_discovered_threads;
 	bool found_more_threads;
 
+	if (concur++) {
+		concur--;
+		return -EBUSY;
+	}
 	atomic_set(&shared_ctx.preparation_error, 0);
 	init_completion(&shared_ctx.all_prepared);
 	init_completion(&shared_ctx.ready_to_commit);
@@ -556,6 +561,7 @@ int landlock_restrict_sibling_threads(co
 		wait_for_completion(&shared_ctx.all_finished);
 
 	tsync_works_release(&works);
+	concur--;
 
 	return atomic_read(&shared_ctx.preparation_error);
 }
--

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2026-02-27  0:03 UTC | newest]

Thread overview: 27+ messages
     [not found] <F1E9B1BEBD8867CA+092eea18-94c7-4c65-a466-95cd3628a88c@uniontech.com>
2026-02-21  7:11 ` [syzbot] [kernel?] INFO: task hung in restrict_one_thread_callback syzbot
2026-02-21  7:28   ` Ding Yihan
2026-02-21 12:00     ` Günther Noack
2026-02-21 13:19       ` Günther Noack
2026-02-23  9:42         ` Günther Noack
2026-02-23 11:29           ` Ding Yihan
2026-02-23 15:16             ` Günther Noack
2026-02-24  3:02               ` Ding Yihan
2026-02-24  3:03                 ` syzbot
2026-02-24  6:27           ` [PATCH] landlock: Fix deadlock " Yihan Ding
2026-02-24  8:48             ` Günther Noack
2026-02-24 14:43     ` [syzbot] [kernel?] INFO: task hung " Günther Noack
     [not found] <7F45E41D790CD8A4+f1dcffc7-5b69-432f-8ad7-e96a3ef66219@uniontech.com>
2026-02-24  5:07 ` syzbot
     [not found] <F4E5EFD28BFB7AA6+108340fb-0592-4dd0-9f93-b7a2b760dc5d@uniontech.com>
2026-02-24  4:08 ` syzbot
2026-02-20 11:11 syzbot
2026-02-23 13:40 ` Frederic Weisbecker
2026-02-23 15:15   ` Günther Noack
2026-02-24  0:10 ` Hillf Danton
2026-02-24  3:05   ` syzbot
2026-02-24 10:00   ` Günther Noack
2026-02-25  5:10 ` Hillf Danton
2026-02-25 10:22 ` Hillf Danton
2026-02-25 12:21 ` Hillf Danton
2026-02-25 22:32 ` Hillf Danton
2026-02-26  2:19 ` Hillf Danton
2026-02-26 10:04 ` Hillf Danton
2026-02-27  0:03 ` Hillf Danton
