Linux NILFS development
* [PATCH v3] nilfs2: reject CLEAN_SEGMENTS ioctl with out-of-range segment numbers
@ 2026-04-30  4:07 Deepanshu Kartikey
  2026-04-30 12:13 ` Ryusuke Konishi
  2026-04-30 18:11 ` Viacheslav Dubeyko
  0 siblings, 2 replies; 3+ messages in thread
From: Deepanshu Kartikey @ 2026-04-30  4:07 UTC
  To: konishi.ryusuke, slava
  Cc: linux-nilfs, linux-kernel, Deepanshu Kartikey,
	syzbot+62f0f99d2f2bb8e3bbd7, stable

Syzbot reported a hung task in nilfs_transaction_begin() where multiple
tasks performing chmod() on a nilfs2 mount blocked for over 143 seconds
waiting to acquire ns_segctor_sem for read:

  INFO: task syz.0.17:5918 blocked for more than 143 seconds.
  Call Trace:
   schedule+0x164/0x360
   rwsem_down_read_slowpath+0x6d9/0x940
   down_read+0x99/0x2e0
   nilfs_transaction_begin+0x364/0x710 fs/nilfs2/segment.c:221
   nilfs_setattr+0x124/0x2c0 fs/nilfs2/inode.c:921
   notify_change+0xc1a/0xf40
   chmod_common+0x273/0x4a0
   do_fchmodat+0x12d/0x230

The writer holding ns_segctor_sem was a concurrent 
NILFS_IOCTL_CLEAN_SEGMENTS caller, stuck inside printk while emitting 
per-element warnings from nilfs_sufile_updatev():

   __nilfs_msg+0x373/0x450 fs/nilfs2/super.c:78
   nilfs_sufile_updatev+0x21c/0x6d0 fs/nilfs2/sufile.c:186
   nilfs_sufile_freev fs/nilfs2/sufile.h:93 [inline]
   nilfs_free_segments fs/nilfs2/segment.c:1140 [inline]
   nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1261 [inline]
   nilfs_segctor_do_construct+0x1f55/0x76c0
   nilfs_clean_segments+0x3bd/0xa50
   nilfs_ioctl_clean_segments fs/nilfs2/ioctl.c:922 [inline]
   nilfs_ioctl+0x261f/0x2780

The root cause is that user-supplied segment numbers are not validated
before nilfs_clean_segments() begins doing work; the range check on
each segnum is performed deep inside the call chain by
nilfs_sufile_updatev(), which emits a nilfs_warn() per invalid entry
while still holding the segctor lock and the sufile mi_sem.  Under load
(repeated invocations across multiple mounts saturating the global
printk path), the cumulative printk latency keeps ns_segctor_sem held
long enough to trip the hung_task watchdog, blocking concurrent
operations such as chmod() that need ns_segctor_sem for read.

Fix by validating the contents of kbufs[4] in nilfs_clean_segments()
immediately after acquiring ns_segctor_sem via nilfs_transaction_lock().
Holding ns_segctor_sem serializes the check against
nilfs_ioctl_resize(), which can modify ns_nsegments, so the validation
uses a consistent value.  Out-of-range segment numbers are rejected
with -EINVAL before any segment-cleaning work begins, so the bad
entries never reach the per-element diagnostic path inside
nilfs_sufile_updatev().

Reported-by: syzbot+62f0f99d2f2bb8e3bbd7@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=62f0f99d2f2bb8e3bbd7
Tested-by: syzbot+62f0f99d2f2bb8e3bbd7@syzkaller.appspotmail.com
Fixes: 4f6b828837b4 ("nilfs2: fix lock order reversal in nilfs_clean_segments ioctl")
Cc: stable@vger.kernel.org
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
Changes in v3:
  - Move validation from nilfs_ioctl_clean_segments() into
    nilfs_clean_segments(), under ns_segctor_sem held for write
    by nilfs_transaction_lock(), to serialize against
    nilfs_ioctl_resize() which can modify ns_nsegments
    (Ryusuke Konishi)
  - Introduce local variables segnumv and nfreesegs for readability,
    rather than open-coding casts of kbufs[4] (Ryusuke Konishi)
  - Emit nilfs_err() once on the first out-of-range segnum and bail
    out, instead of nilfs_warn() per element (Ryusuke Konishi)
  - Add bail_unlock label for the early-failure path, parallel to
    the existing out_unlock structure (Ryusuke Konishi)

Changes in v2:
  - Reuse existing 'n' loop variable instead of introducing a new
    one (Slava Dubeyko)
  - Add dedicated out_free_segnums label so the validation-failure
    path falls through the existing cleanup ladder rather than
    duplicating kfree(kbufs[4]) inline (Slava Dubeyko)
---
 fs/nilfs2/segment.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index 1491a4d4b1e1..dc54643866ce 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -2512,12 +2512,33 @@ int nilfs_clean_segments(struct super_block *sb, struct nilfs_argv *argv,
 	struct nilfs_sc_info *sci = nilfs->ns_writer;
 	struct nilfs_transaction_info ti;
 	int err;
+	size_t i, nfreesegs = argv[4].v_nmembs;
+	__u64 *segnumv = kbufs[4];
 
 	if (unlikely(!sci))
 		return -EROFS;
 
 	nilfs_transaction_lock(sb, &ti, 1);
 
+	/*
+	 * Validate segment numbers under ns_segctor_sem (held for write
+	 * by nilfs_transaction_lock above) so the check is serialized
+	 * against nilfs_ioctl_resize(), which can modify ns_nsegments.
+	 * Rejecting bad input here, before any segment-cleaning work
+	 * begins, avoids the per-element diagnostic path inside
+	 * nilfs_sufile_updatev() that would otherwise run under this
+	 * same lock and stall concurrent readers.
+	 */
+	for (i = 0; i < nfreesegs; i++) {
+		if (segnumv[i] >= nilfs->ns_nsegments) {
+			nilfs_err(sb,
+				 "Segment number %llu to be freed is out of range",
+				 (unsigned long long)segnumv[i]);
+			err = -EINVAL;
+			goto bail_unlock;
+		}
+	}
+
 	err = nilfs_mdt_save_to_shadow_map(nilfs->ns_dat);
 	if (unlikely(err))
 		goto out_unlock;
@@ -2558,6 +2579,7 @@ int nilfs_clean_segments(struct super_block *sb, struct nilfs_argv *argv,
 	sci->sc_freesegs = NULL;
 	sci->sc_nfreesegs = 0;
 	nilfs_mdt_clear_shadow_map(nilfs->ns_dat);
+ bail_unlock:
 	nilfs_transaction_unlock(sb);
 	return err;
 }
-- 
2.43.0
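
The range check added by the patch can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel code: check_segnums() and its bad_idx parameter are hypothetical names introduced here, and the nsegments argument stands in for nilfs->ns_nsegments. It only shows the "reject the whole request on the first bad entry" behaviour described in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the validation loop in the patch.
 * Returns 0 if every entry is below nsegments; otherwise returns -EINVAL
 * and, if bad_idx is non-NULL, stores the index of the first bad entry.
 */
int check_segnums(const uint64_t *segnumv, size_t nfreesegs,
		  uint64_t nsegments, size_t *bad_idx)
{
	size_t i;

	/* An empty request may pass a NULL vector. */
	assert(segnumv != NULL || nfreesegs == 0);

	for (i = 0; i < nfreesegs; i++) {
		if (segnumv[i] >= nsegments) {
			if (bad_idx)
				*bad_idx = i;
			return -EINVAL;	/* stop at the first bad entry */
		}
	}
	return 0;
}
```

Because the loop stops at the first out-of-range entry, at most one diagnostic would be produced per request, regardless of how many invalid entries the caller supplied.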




* Re: [PATCH v3] nilfs2: reject CLEAN_SEGMENTS ioctl with out-of-range segment numbers
  2026-04-30  4:07 [PATCH v3] nilfs2: reject CLEAN_SEGMENTS ioctl with out-of-range segment numbers Deepanshu Kartikey
@ 2026-04-30 12:13 ` Ryusuke Konishi
  2026-04-30 18:11 ` Viacheslav Dubeyko
  1 sibling, 0 replies; 3+ messages in thread
From: Ryusuke Konishi @ 2026-04-30 12:13 UTC
  To: Deepanshu Kartikey
  Cc: slava, linux-nilfs, linux-kernel, syzbot+62f0f99d2f2bb8e3bbd7,
	stable

On Thu, Apr 30, 2026 at 1:07 PM Deepanshu Kartikey wrote:
> Reported-by: syzbot+62f0f99d2f2bb8e3bbd7@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=62f0f99d2f2bb8e3bbd7
> Tested-by: syzbot+62f0f99d2f2bb8e3bbd7@syzkaller.appspotmail.com
> Fixes: 4f6b828837b4 ("nilfs2: fix lock order reversal in nilfs_clean_segments ioctl")

The cause appears to be commit 071cb4b81987 ("nilfs2: eliminate
removal list of segments"), which removed the segment release logic
that used a list of segment information structures.
Prior to that, the validity check of segment numbers was performed
within nilfs_ioctl_prepare_clean_segments().

Everything else seems OK, so I'll fix only that tag myself, perform a
final check, and then send it upstream.

Thanks,
Ryusuke Konishi



* Re: [PATCH v3] nilfs2: reject CLEAN_SEGMENTS ioctl with out-of-range segment numbers
  2026-04-30  4:07 [PATCH v3] nilfs2: reject CLEAN_SEGMENTS ioctl with out-of-range segment numbers Deepanshu Kartikey
  2026-04-30 12:13 ` Ryusuke Konishi
@ 2026-04-30 18:11 ` Viacheslav Dubeyko
  1 sibling, 0 replies; 3+ messages in thread
From: Viacheslav Dubeyko @ 2026-04-30 18:11 UTC
  To: Deepanshu Kartikey, konishi.ryusuke, slava
  Cc: linux-nilfs, linux-kernel, syzbot+62f0f99d2f2bb8e3bbd7, stable

On Thu, 2026-04-30 at 09:37 +0530, Deepanshu Kartikey wrote:
> @@ -2512,12 +2512,33 @@ int nilfs_clean_segments(struct super_block *sb, struct nilfs_argv *argv,
>  	struct nilfs_sc_info *sci = nilfs->ns_writer;
>  	struct nilfs_transaction_info ti;
>  	int err;

I usually prefer to keep the err variable at the end of the declarations,
because it holds the final state of the function. I always feel that something
is wrong when such a variable is hidden in the middle of the declaration
list. :) This is not a critical remark. But anyway... :)

The patch looks good to me.

Thanks,
Slava.
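
For illustration only, the control flow under review can be sketched in userspace C. This is a simplified model under stated assumptions, not the nilfs2 code: struct fake_fs, its fields, and clean_segments_sketch() are hypothetical, and the two labels merely mirror the roles of out_unlock and bail_unlock in the patch. The err variable is declared last, following the preference expressed above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the nilfs2 state touched by the patch. */
struct fake_fs {
	uint64_t nsegments;	/* plays the role of ns_nsegments */
	int locked;		/* plays the role of ns_segctor_sem */
	int shadow_saved;	/* plays the role of the DAT shadow map */
	int fail_work;		/* test knob: force the work step to fail */
	int cleaned;		/* counts completed cleaning passes */
};

int clean_segments_sketch(struct fake_fs *fs, const uint64_t *segnumv,
			  size_t nfreesegs)
{
	size_t i;
	int err;	/* declared last, as the review comment prefers */

	assert(fs != NULL);
	fs->locked = 1;

	/* Validate first: nothing has been set up yet on failure. */
	for (i = 0; i < nfreesegs; i++) {
		if (segnumv[i] >= fs->nsegments) {
			err = -EINVAL;
			goto bail_unlock;
		}
	}

	fs->shadow_saved = 1;	/* setup that out_unlock must undo */
	if (fs->fail_work) {
		err = -EIO;
		goto out_unlock;
	}
	fs->cleaned++;		/* the cleaning work would happen here */
	err = 0;

out_unlock:
	fs->shadow_saved = 0;	/* teardown for state created after validation */
bail_unlock:
	fs->locked = 0;
	return err;
}
```

The early-failure label sits below the teardown statements, so a rejected request releases the lock without touching state that was never set up.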


