public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] nilfs2: reject CLEAN_SEGMENTS ioctl with out-of-range segment numbers
@ 2026-04-28  4:02 Deepanshu Kartikey
  2026-04-28 18:58 ` Viacheslav Dubeyko
  0 siblings, 1 reply; 7+ messages in thread
From: Deepanshu Kartikey @ 2026-04-28  4:02 UTC (permalink / raw)
  To: konishi.ryusuke, slava
  Cc: linux-nilfs, linux-kernel, Deepanshu Kartikey,
	syzbot+62f0f99d2f2bb8e3bbd7

Syzbot reported a hung task in nilfs_transaction_begin() where multiple
tasks performing chmod() on a nilfs2 mount blocked for over 143 seconds
waiting to acquire ns_segctor_sem for read:

  INFO: task syz.0.17:5918 blocked for more than 143 seconds.
  Call Trace:
   schedule+0x164/0x360
   rwsem_down_read_slowpath+0x6d9/0x940
   down_read+0x99/0x2e0
   nilfs_transaction_begin+0x364/0x710 fs/nilfs2/segment.c:221
   nilfs_setattr+0x124/0x2c0 fs/nilfs2/inode.c:921
   notify_change+0xc1a/0xf40
   chmod_common+0x273/0x4a0
   do_fchmodat+0x12d/0x230

The writer holding ns_segctor_sem was a concurrent NILFS_IOCTL_CLEAN_SEGMENTS
caller, stuck inside printk while emitting per-element warnings from
nilfs_sufile_updatev():

   __nilfs_msg+0x373/0x450 fs/nilfs2/super.c:78
   nilfs_sufile_updatev+0x21c/0x6d0 fs/nilfs2/sufile.c:186
   nilfs_sufile_freev fs/nilfs2/sufile.h:93 [inline]
   nilfs_free_segments fs/nilfs2/segment.c:1140 [inline]
   nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1261 [inline]
   nilfs_segctor_do_construct+0x1f55/0x76c0
   nilfs_clean_segments+0x3bd/0xa50
   nilfs_ioctl_clean_segments fs/nilfs2/ioctl.c:922 [inline]
   nilfs_ioctl+0x261f/0x2780

The root cause is that nilfs_ioctl_clean_segments() does not validate
the user-supplied segment numbers in kbufs[4] before calling
nilfs_clean_segments(), which acquires ns_segctor_sem for write.  The
range check on each segnum is performed deep inside the call chain by
nilfs_sufile_updatev(), which emits a nilfs_warn() per invalid entry
while still under the segctor lock and the sufile mi_sem.  Under load
(repeated invocations across multiple mounts saturating the global
printk path), the cumulative printk latency keeps ns_segctor_sem held
long enough to trip the hung_task watchdog, blocking concurrent
operations such as chmod() that need ns_segctor_sem for read.

Fix by validating the contents of kbufs[4] in the ioctl entry path,
before any FS-wide lock is acquired.  Out-of-range segment numbers are
rejected with -EINVAL synchronously, with no work performed under
ns_segctor_sem.

Reported-by: syzbot+62f0f99d2f2bb8e3bbd7@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=62f0f99d2f2bb8e3bbd7
Fixes: 4f6b828837b4 ("nilfs2: fix lock order reversal in nilfs_clean_segments ioctl")
Tested-by: syzbot+62f0f99d2f2bb8e3bbd7@syzkaller.appspotmail.com
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
 fs/nilfs2/ioctl.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/fs/nilfs2/ioctl.c b/fs/nilfs2/ioctl.c
index e0a606643e87..38822dce1839 100644
--- a/fs/nilfs2/ioctl.c
+++ b/fs/nilfs2/ioctl.c
@@ -846,6 +846,7 @@ static int nilfs_ioctl_clean_segments(struct inode *inode, struct file *filp,
 	struct the_nilfs *nilfs;
 	size_t len, nsegs;
 	int n, ret;
+	size_t i;
 
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
@@ -876,6 +877,21 @@ static int nilfs_ioctl_clean_segments(struct inode *inode, struct file *filp,
 	}
 	nilfs = inode->i_sb->s_fs_info;
 
+	/*
+	 * Validate segment numbers against the filesystem's segment count
+	 * before entering nilfs_clean_segments(), which acquires
+	 * ns_segctor_sem for write.  Catching invalid segnums here avoids
+	 * holding that lock while emitting per-element diagnostics under
+	 * the segment constructor.
+	 */
+	for (i = 0; i < nsegs; i++) {
+		if (((__u64 *)kbufs[4])[i] >= nilfs->ns_nsegments) {
+			ret = -EINVAL;
+			kfree(kbufs[4]);
+			goto out;
+		}
+	}
+
 	for (n = 0; n < 4; n++) {
 		ret = -EINVAL;
 		if (argv[n].v_size != argsz[n])
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] nilfs2: reject CLEAN_SEGMENTS ioctl with out-of-range segment numbers
  2026-04-28  4:02 [PATCH] nilfs2: reject CLEAN_SEGMENTS ioctl with out-of-range segment numbers Deepanshu Kartikey
@ 2026-04-28 18:58 ` Viacheslav Dubeyko
  2026-04-29  1:50   ` Deepanshu Kartikey
  0 siblings, 1 reply; 7+ messages in thread
From: Viacheslav Dubeyko @ 2026-04-28 18:58 UTC (permalink / raw)
  To: Deepanshu Kartikey, konishi.ryusuke, slava
  Cc: linux-nilfs, linux-kernel, syzbot+62f0f99d2f2bb8e3bbd7

On Tue, 2026-04-28 at 09:32 +0530, Deepanshu Kartikey wrote:
> Syzbot reported a hung task in nilfs_transaction_begin() where multiple
> tasks performing chmod() on a nilfs2 mount blocked for over 143 seconds
> waiting to acquire ns_segctor_sem for read:
> 
>   INFO: task syz.0.17:5918 blocked for more than 143 seconds.
>   Call Trace:
>    schedule+0x164/0x360
>    rwsem_down_read_slowpath+0x6d9/0x940
>    down_read+0x99/0x2e0
>    nilfs_transaction_begin+0x364/0x710 fs/nilfs2/segment.c:221
>    nilfs_setattr+0x124/0x2c0 fs/nilfs2/inode.c:921
>    notify_change+0xc1a/0xf40
>    chmod_common+0x273/0x4a0
>    do_fchmodat+0x12d/0x230
> 
> The writer holding ns_segctor_sem was a concurrent NILFS_IOCTL_CLEAN_SEGMENTS
> caller, stuck inside printk while emitting per-element warnings from
> nilfs_sufile_updatev():
> 
>    __nilfs_msg+0x373/0x450 fs/nilfs2/super.c:78
>    nilfs_sufile_updatev+0x21c/0x6d0 fs/nilfs2/sufile.c:186
>    nilfs_sufile_freev fs/nilfs2/sufile.h:93 [inline]
>    nilfs_free_segments fs/nilfs2/segment.c:1140 [inline]
>    nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1261 [inline]
>    nilfs_segctor_do_construct+0x1f55/0x76c0
>    nilfs_clean_segments+0x3bd/0xa50
>    nilfs_ioctl_clean_segments fs/nilfs2/ioctl.c:922 [inline]
>    nilfs_ioctl+0x261f/0x2780
> 
> The root cause is that nilfs_ioctl_clean_segments() does not validate
> the user-supplied segment numbers in kbufs[4] before calling
> nilfs_clean_segments(), which acquires ns_segctor_sem for write.  The
> range check on each segnum is performed deep inside the call chain by
> nilfs_sufile_updatev(), which emits a nilfs_warn() per invalid entry
> while still under the segctor lock and the sufile mi_sem.  Under load
> (repeated invocations across multiple mounts saturating the global
> printk path), the cumulative printk latency keeps ns_segctor_sem held
> long enough to trip the hung_task watchdog, blocking concurrent
> operations such as chmod() that need ns_segctor_sem for read.
> 
> Fix by validating the contents of kbufs[4] in the ioctl entry path,
> before any FS-wide lock is acquired.  Out-of-range segment numbers are
> rejected with -EINVAL synchronously, with no work performed under
> ns_segctor_sem.
> 
> Reported-by: syzbot+62f0f99d2f2bb8e3bbd7@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=62f0f99d2f2bb8e3bbd7
> Fixes: 4f6b828837b4 ("nilfs2: fix lock order reversal in nilfs_clean_segments ioctl")
> Tested-by: syzbot+62f0f99d2f2bb8e3bbd7@syzkaller.appspotmail.com
> Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
> ---
>  fs/nilfs2/ioctl.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/fs/nilfs2/ioctl.c b/fs/nilfs2/ioctl.c
> index e0a606643e87..38822dce1839 100644
> --- a/fs/nilfs2/ioctl.c
> +++ b/fs/nilfs2/ioctl.c
> @@ -846,6 +846,7 @@ static int nilfs_ioctl_clean_segments(struct inode *inode, struct file *filp,
>  	struct the_nilfs *nilfs;
>  	size_t len, nsegs;
>  	int n, ret;
> +	size_t i;

What about re-using the n variable? Does it make sense to introduce new one?

>  
>  	if (!capable(CAP_SYS_ADMIN))
>  		return -EPERM;
> @@ -876,6 +877,21 @@ static int nilfs_ioctl_clean_segments(struct inode *inode, struct file *filp,
>  	}
>  	nilfs = inode->i_sb->s_fs_info;
>  
> +	/*
> +	 * Validate segment numbers against the filesystem's segment count
> +	 * before entering nilfs_clean_segments(), which acquires
> +	 * ns_segctor_sem for write.  Catching invalid segnums here avoids
> +	 * holding that lock while emitting per-element diagnostics under
> +	 * the segment constructor.
> +	 */
> +	for (i = 0; i < nsegs; i++) {
> +		if (((__u64 *)kbufs[4])[i] >= nilfs->ns_nsegments) {
> +			ret = -EINVAL;
> +			kfree(kbufs[4]);
> +			goto out;

Are you sure that you need to free buffer here and go to out? Maybe, we can
introduce another label and to jump to kfree(kbufs[4]) at the end of method?

Thanks,
Slava.

> +		}
> +	}
> +
>  	for (n = 0; n < 4; n++) {
>  		ret = -EINVAL;
>  		if (argv[n].v_size != argsz[n])


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] nilfs2: reject CLEAN_SEGMENTS ioctl with out-of-range segment numbers
  2026-04-28 18:58 ` Viacheslav Dubeyko
@ 2026-04-29  1:50   ` Deepanshu Kartikey
  2026-04-29 12:31     ` Ryusuke Konishi
  0 siblings, 1 reply; 7+ messages in thread
From: Deepanshu Kartikey @ 2026-04-29  1:50 UTC (permalink / raw)
  To: Viacheslav Dubeyko
  Cc: konishi.ryusuke, slava, linux-nilfs, linux-kernel,
	syzbot+62f0f99d2f2bb8e3bbd7

On Wed, Apr 29, 2026 at 12:29 AM Viacheslav Dubeyko <vdubeyko@redhat.com> wrote:
>
> On Tue, 2026-04-28 at 09:32 +0530, Deepanshu Kartikey wrote:
> > Syzbot reported a hung task in nilfs_transaction_begin() where multiple
> > tasks performing chmod() on a nilfs2 mount blocked for over 143 seconds
> > waiting to acquire ns_segctor_sem for read:
> >
> >   INFO: task syz.0.17:5918 blocked for more than 143 seconds.
> >   Call Trace:
> >    schedule+0x164/0x360
> >    rwsem_down_read_slowpath+0x6d9/0x940
> >    down_read+0x99/0x2e0
> >    nilfs_transaction_begin+0x364/0x710 fs/nilfs2/segment.c:221
> >    nilfs_setattr+0x124/0x2c0 fs/nilfs2/inode.c:921
> >    notify_change+0xc1a/0xf40
> >    chmod_common+0x273/0x4a0
> >    do_fchmodat+0x12d/0x230
> >
> > The writer holding ns_segctor_sem was a concurrent NILFS_IOCTL_CLEAN_SEGMENTS
> > caller, stuck inside printk while emitting per-element warnings from
> > nilfs_sufile_updatev():
> >
> >    __nilfs_msg+0x373/0x450 fs/nilfs2/super.c:78
> >    nilfs_sufile_updatev+0x21c/0x6d0 fs/nilfs2/sufile.c:186
> >    nilfs_sufile_freev fs/nilfs2/sufile.h:93 [inline]
> >    nilfs_free_segments fs/nilfs2/segment.c:1140 [inline]
> >    nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1261 [inline]
> >    nilfs_segctor_do_construct+0x1f55/0x76c0
> >    nilfs_clean_segments+0x3bd/0xa50
> >    nilfs_ioctl_clean_segments fs/nilfs2/ioctl.c:922 [inline]
> >    nilfs_ioctl+0x261f/0x2780
> >
> > The root cause is that nilfs_ioctl_clean_segments() does not validate
> > the user-supplied segment numbers in kbufs[4] before calling
> > nilfs_clean_segments(), which acquires ns_segctor_sem for write.  The
> > range check on each segnum is performed deep inside the call chain by
> > nilfs_sufile_updatev(), which emits a nilfs_warn() per invalid entry
> > while still under the segctor lock and the sufile mi_sem.  Under load
> > (repeated invocations across multiple mounts saturating the global
> > printk path), the cumulative printk latency keeps ns_segctor_sem held
> > long enough to trip the hung_task watchdog, blocking concurrent
> > operations such as chmod() that need ns_segctor_sem for read.
> >
> > Fix by validating the contents of kbufs[4] in the ioctl entry path,
> > before any FS-wide lock is acquired.  Out-of-range segment numbers are
> > rejected with -EINVAL synchronously, with no work performed under
> > ns_segctor_sem.
> >
> > Reported-by: syzbot+62f0f99d2f2bb8e3bbd7@syzkaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/bug?extid=62f0f99d2f2bb8e3bbd7
> > Fixes: 4f6b828837b4 ("nilfs2: fix lock order reversal in nilfs_clean_segments ioctl")
> > Tested-by: syzbot+62f0f99d2f2bb8e3bbd7@syzkaller.appspotmail.com
> > Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
> > ---
> >  fs/nilfs2/ioctl.c | 16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> >
> > diff --git a/fs/nilfs2/ioctl.c b/fs/nilfs2/ioctl.c
> > index e0a606643e87..38822dce1839 100644
> > --- a/fs/nilfs2/ioctl.c
> > +++ b/fs/nilfs2/ioctl.c
> > @@ -846,6 +846,7 @@ static int nilfs_ioctl_clean_segments(struct inode *inode, struct file *filp,
> >       struct the_nilfs *nilfs;
> >       size_t len, nsegs;
> >       int n, ret;
> > +     size_t i;
>
> What about re-using the n variable? Does it make sense to introduce new one?
>
> >
> >       if (!capable(CAP_SYS_ADMIN))
> >               return -EPERM;
> > @@ -876,6 +877,21 @@ static int nilfs_ioctl_clean_segments(struct inode *inode, struct file *filp,
> >       }
> >       nilfs = inode->i_sb->s_fs_info;
> >
> > +     /*
> > +      * Validate segment numbers against the filesystem's segment count
> > +      * before entering nilfs_clean_segments(), which acquires
> > +      * ns_segctor_sem for write.  Catching invalid segnums here avoids
> > +      * holding that lock while emitting per-element diagnostics under
> > +      * the segment constructor.
> > +      */
> > +     for (i = 0; i < nsegs; i++) {
> > +             if (((__u64 *)kbufs[4])[i] >= nilfs->ns_nsegments) {
> > +                     ret = -EINVAL;
> > +                     kfree(kbufs[4]);
> > +                     goto out;
>
> Are you sure that you need to free buffer here and go to out? Maybe, we can
> introduce another label and to jump to kfree(kbufs[4]) at the end of method?
>
> Thanks,
> Slava.
>
> > +             }
> > +     }
> > +
> >       for (n = 0; n < 4; n++) {
> >               ret = -EINVAL;
> >               if (argv[n].v_size != argsz[n])
>

Thanks for the feedback. I have sent patch v2.

Thanks

Deepanshu Kartikey

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] nilfs2: reject CLEAN_SEGMENTS ioctl with out-of-range segment numbers
  2026-04-29  1:50   ` Deepanshu Kartikey
@ 2026-04-29 12:31     ` Ryusuke Konishi
  2026-04-30  4:08       ` Deepanshu Kartikey
  0 siblings, 1 reply; 7+ messages in thread
From: Ryusuke Konishi @ 2026-04-29 12:31 UTC (permalink / raw)
  To: Deepanshu Kartikey
  Cc: Viacheslav Dubeyko, slava, linux-nilfs, linux-kernel,
	syzbot+62f0f99d2f2bb8e3bbd7

On Wed, Apr 29, 2026 at 10:50 AM Deepanshu Kartikey wrote:
>
> On Wed, Apr 29, 2026 at 12:29 AM Viacheslav Dubeyko wrote:
> >
> > On Tue, 2026-04-28 at 09:32 +0530, Deepanshu Kartikey wrote:
> > > Syzbot reported a hung task in nilfs_transaction_begin() where multiple
> > > tasks performing chmod() on a nilfs2 mount blocked for over 143 seconds
> > > waiting to acquire ns_segctor_sem for read:
> > >
> > >   INFO: task syz.0.17:5918 blocked for more than 143 seconds.
> > >   Call Trace:
> > >    schedule+0x164/0x360
> > >    rwsem_down_read_slowpath+0x6d9/0x940
> > >    down_read+0x99/0x2e0
> > >    nilfs_transaction_begin+0x364/0x710 fs/nilfs2/segment.c:221
> > >    nilfs_setattr+0x124/0x2c0 fs/nilfs2/inode.c:921
> > >    notify_change+0xc1a/0xf40
> > >    chmod_common+0x273/0x4a0
> > >    do_fchmodat+0x12d/0x230
> > >
> > > The writer holding ns_segctor_sem was a concurrent NILFS_IOCTL_CLEAN_SEGMENTS
> > > caller, stuck inside printk while emitting per-element warnings from
> > > nilfs_sufile_updatev():
> > >
> > >    __nilfs_msg+0x373/0x450 fs/nilfs2/super.c:78
> > >    nilfs_sufile_updatev+0x21c/0x6d0 fs/nilfs2/sufile.c:186
> > >    nilfs_sufile_freev fs/nilfs2/sufile.h:93 [inline]
> > >    nilfs_free_segments fs/nilfs2/segment.c:1140 [inline]
> > >    nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1261 [inline]
> > >    nilfs_segctor_do_construct+0x1f55/0x76c0
> > >    nilfs_clean_segments+0x3bd/0xa50
> > >    nilfs_ioctl_clean_segments fs/nilfs2/ioctl.c:922 [inline]
> > >    nilfs_ioctl+0x261f/0x2780
> > >
> > > The root cause is that nilfs_ioctl_clean_segments() does not validate
> > > the user-supplied segment numbers in kbufs[4] before calling
> > > nilfs_clean_segments(), which acquires ns_segctor_sem for write.  The
> > > range check on each segnum is performed deep inside the call chain by
> > > nilfs_sufile_updatev(), which emits a nilfs_warn() per invalid entry
> > > while still under the segctor lock and the sufile mi_sem.  Under load
> > > (repeated invocations across multiple mounts saturating the global
> > > printk path), the cumulative printk latency keeps ns_segctor_sem held
> > > long enough to trip the hung_task watchdog, blocking concurrent
> > > operations such as chmod() that need ns_segctor_sem for read.
> > >
> > > Fix by validating the contents of kbufs[4] in the ioctl entry path,
> > > before any FS-wide lock is acquired.  Out-of-range segment numbers are
> > > rejected with -EINVAL synchronously, with no work performed under
> > > ns_segctor_sem.
> > >
> > > Reported-by: syzbot+62f0f99d2f2bb8e3bbd7@syzkaller.appspotmail.com
> > > Closes: https://syzkaller.appspot.com/bug?extid=62f0f99d2f2bb8e3bbd7
> > > Fixes: 4f6b828837b4 ("nilfs2: fix lock order reversal in nilfs_clean_segments ioctl")
> > > Tested-by: syzbot+62f0f99d2f2bb8e3bbd7@syzkaller.appspotmail.com
> > > Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
> > > ---
> > >  fs/nilfs2/ioctl.c | 16 ++++++++++++++++
> > >  1 file changed, 16 insertions(+)
> > >
> > > diff --git a/fs/nilfs2/ioctl.c b/fs/nilfs2/ioctl.c
> > > index e0a606643e87..38822dce1839 100644
> > > --- a/fs/nilfs2/ioctl.c
> > > +++ b/fs/nilfs2/ioctl.c
> > > @@ -846,6 +846,7 @@ static int nilfs_ioctl_clean_segments(struct inode *inode, struct file *filp,
> > >       struct the_nilfs *nilfs;
> > >       size_t len, nsegs;
> > >       int n, ret;
> > > +     size_t i;
> >
> > What about re-using the n variable? Does it make sense to introduce new one?
> >
> > >
> > >       if (!capable(CAP_SYS_ADMIN))
> > >               return -EPERM;
> > > @@ -876,6 +877,21 @@ static int nilfs_ioctl_clean_segments(struct inode *inode, struct file *filp,
> > >       }
> > >       nilfs = inode->i_sb->s_fs_info;
> > >
> > > +     /*
> > > +      * Validate segment numbers against the filesystem's segment count
> > > +      * before entering nilfs_clean_segments(), which acquires
> > > +      * ns_segctor_sem for write.  Catching invalid segnums here avoids
> > > +      * holding that lock while emitting per-element diagnostics under
> > > +      * the segment constructor.
> > > +      */
> > > +     for (i = 0; i < nsegs; i++) {
> > > +             if (((__u64 *)kbufs[4])[i] >= nilfs->ns_nsegments) {
> > > +                     ret = -EINVAL;
> > > +                     kfree(kbufs[4]);
> > > +                     goto out;
> >
> > Are you sure that you need to free buffer here and go to out? Maybe, we can
> > introduce another label and to jump to kfree(kbufs[4]) at the end of method?
> >
> > Thanks,
> > Slava.
> >
> > > +             }
> > > +     }
> > > +
> > >       for (n = 0; n < 4; n++) {
> > >               ret = -EINVAL;
> > >               if (argv[n].v_size != argsz[n])
> >
>
> Thanks for the feedback. I have sent patch v2.
>
> Thanks
>
> Deepanshu Kartikey

Thank you, Deepanshu, for the patch proposal.

Because nilfs->ns_nsegments can be modified by nilfs_ioctl_resize(),
we must avoid race conditions regarding this proposed fix.

Therefore, it's appropriate to insert this check within the write lock
section of nilfs->ns_segctor_sem, that is, immediately after calling
nilfs_transaction_lock() within nilfs_clean_segments().

As a coding comment, directly referencing kbufs[4] as an array is not
very readable, so it's better to declare a variable in the local
variable declaration section of nilfs_clean_segments() like this:

        size_t i, nsegs = argv[4].v_nmembs;
        __u64 *segnumv = kbufs[4];

and then compare by referencing segnumv[i].
Here, the original variable name "nsegs" is confusingly similar to
"ns_nsegments", therefore, it would be better to simply change it to
"n" or rename it to something that more concisely represents the
number of segments in the array.

Also, to help identify the error's cause, I recommend adding an error
message like the following within the pre-check loop:

        nilfs_err(sb,
                "Segment number %llu to be freed is out of range",
                (unsigned long long)segnumv[i]);

Finally, a minor comment: to avoid scattering the function's exit path
when making corrections according to the above policy, it's better to
add a label like the following to nilfs_clean_segments() and jump to
it, rather than returning separately.

out_unlock:
        ...

bail_unlock:
        nilfs_transaction_unlock(sb);
        return err;

Thanks,
Ryusuke Konishi

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] nilfs2: reject CLEAN_SEGMENTS ioctl with out-of-range segment numbers
  2026-04-29 12:31     ` Ryusuke Konishi
@ 2026-04-30  4:08       ` Deepanshu Kartikey
  0 siblings, 0 replies; 7+ messages in thread
From: Deepanshu Kartikey @ 2026-04-30  4:08 UTC (permalink / raw)
  To: Ryusuke Konishi
  Cc: Viacheslav Dubeyko, slava, linux-nilfs, linux-kernel,
	syzbot+62f0f99d2f2bb8e3bbd7

On Wed, Apr 29, 2026 at 6:02 PM Ryusuke Konishi
<konishi.ryusuke@gmail.com> wrote:
>
> On Wed, Apr 29, 2026 at 10:50 AM Deepanshu Kartikey wrote:
> >
> > On Wed, Apr 29, 2026 at 12:29 AM Viacheslav Dubeyko wrote:
> > >
> > > On Tue, 2026-04-28 at 09:32 +0530, Deepanshu Kartikey wrote:
> > > > Syzbot reported a hung task in nilfs_transaction_begin() where multiple
> > > > tasks performing chmod() on a nilfs2 mount blocked for over 143 seconds
> > > > waiting to acquire ns_segctor_sem for read:
> > > >
> > > >   INFO: task syz.0.17:5918 blocked for more than 143 seconds.
> > > >   Call Trace:
> > > >    schedule+0x164/0x360
> > > >    rwsem_down_read_slowpath+0x6d9/0x940
> > > >    down_read+0x99/0x2e0
> > > >    nilfs_transaction_begin+0x364/0x710 fs/nilfs2/segment.c:221
> > > >    nilfs_setattr+0x124/0x2c0 fs/nilfs2/inode.c:921
> > > >    notify_change+0xc1a/0xf40
> > > >    chmod_common+0x273/0x4a0
> > > >    do_fchmodat+0x12d/0x230
> > > >
> > > > The writer holding ns_segctor_sem was a concurrent NILFS_IOCTL_CLEAN_SEGMENTS
> > > > caller, stuck inside printk while emitting per-element warnings from
> > > > nilfs_sufile_updatev():
> > > >
> > > >    __nilfs_msg+0x373/0x450 fs/nilfs2/super.c:78
> > > >    nilfs_sufile_updatev+0x21c/0x6d0 fs/nilfs2/sufile.c:186
> > > >    nilfs_sufile_freev fs/nilfs2/sufile.h:93 [inline]
> > > >    nilfs_free_segments fs/nilfs2/segment.c:1140 [inline]
> > > >    nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1261 [inline]
> > > >    nilfs_segctor_do_construct+0x1f55/0x76c0
> > > >    nilfs_clean_segments+0x3bd/0xa50
> > > >    nilfs_ioctl_clean_segments fs/nilfs2/ioctl.c:922 [inline]
> > > >    nilfs_ioctl+0x261f/0x2780
> > > >
> > > > The root cause is that nilfs_ioctl_clean_segments() does not validate
> > > > the user-supplied segment numbers in kbufs[4] before calling
> > > > nilfs_clean_segments(), which acquires ns_segctor_sem for write.  The
> > > > range check on each segnum is performed deep inside the call chain by
> > > > nilfs_sufile_updatev(), which emits a nilfs_warn() per invalid entry
> > > > while still under the segctor lock and the sufile mi_sem.  Under load
> > > > (repeated invocations across multiple mounts saturating the global
> > > > printk path), the cumulative printk latency keeps ns_segctor_sem held
> > > > long enough to trip the hung_task watchdog, blocking concurrent
> > > > operations such as chmod() that need ns_segctor_sem for read.
> > > >
> > > > Fix by validating the contents of kbufs[4] in the ioctl entry path,
> > > > before any FS-wide lock is acquired.  Out-of-range segment numbers are
> > > > rejected with -EINVAL synchronously, with no work performed under
> > > > ns_segctor_sem.
> > > >
> > > > Reported-by: syzbot+62f0f99d2f2bb8e3bbd7@syzkaller.appspotmail.com
> > > > Closes: https://syzkaller.appspot.com/bug?extid=62f0f99d2f2bb8e3bbd7
> > > > Fixes: 4f6b828837b4 ("nilfs2: fix lock order reversal in nilfs_clean_segments ioctl")
> > > > Tested-by: syzbot+62f0f99d2f2bb8e3bbd7@syzkaller.appspotmail.com
> > > > Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
> > > > ---
> > > >  fs/nilfs2/ioctl.c | 16 ++++++++++++++++
> > > >  1 file changed, 16 insertions(+)
> > > >
> > > > diff --git a/fs/nilfs2/ioctl.c b/fs/nilfs2/ioctl.c
> > > > index e0a606643e87..38822dce1839 100644
> > > > --- a/fs/nilfs2/ioctl.c
> > > > +++ b/fs/nilfs2/ioctl.c
> > > > @@ -846,6 +846,7 @@ static int nilfs_ioctl_clean_segments(struct inode *inode, struct file *filp,
> > > >       struct the_nilfs *nilfs;
> > > >       size_t len, nsegs;
> > > >       int n, ret;
> > > > +     size_t i;
> > >
> > > What about re-using the n variable? Does it make sense to introduce new one?
> > >
> > > >
> > > >       if (!capable(CAP_SYS_ADMIN))
> > > >               return -EPERM;
> > > > @@ -876,6 +877,21 @@ static int nilfs_ioctl_clean_segments(struct inode *inode, struct file *filp,
> > > >       }
> > > >       nilfs = inode->i_sb->s_fs_info;
> > > >
> > > > +     /*
> > > > +      * Validate segment numbers against the filesystem's segment count
> > > > +      * before entering nilfs_clean_segments(), which acquires
> > > > +      * ns_segctor_sem for write.  Catching invalid segnums here avoids
> > > > +      * holding that lock while emitting per-element diagnostics under
> > > > +      * the segment constructor.
> > > > +      */
> > > > +     for (i = 0; i < nsegs; i++) {
> > > > +             if (((__u64 *)kbufs[4])[i] >= nilfs->ns_nsegments) {
> > > > +                     ret = -EINVAL;
> > > > +                     kfree(kbufs[4]);
> > > > +                     goto out;
> > >
> > > Are you sure that you need to free buffer here and go to out? Maybe, we can
> > > introduce another label and to jump to kfree(kbufs[4]) at the end of method?
> > >
> > > Thanks,
> > > Slava.
> > >
> > > > +             }
> > > > +     }
> > > > +
> > > >       for (n = 0; n < 4; n++) {
> > > >               ret = -EINVAL;
> > > >               if (argv[n].v_size != argsz[n])
> > >
> >
> > Thanks for the feedback. I have sent patch v2.
> >
> > Thanks
> >
> > Deepanshu Kartikey
>
> Thank you, Deepanshu, for the patch proposal.
>
> Because nilfs->ns_nsegments can be modified by nilfs_ioctl_resize(),
> we must avoid race conditions regarding this proposed fix.
>
> Therefore, it's appropriate to insert this check within the write lock
> section of nilfs->ns_segctor_sem, that is, immediately after calling
> nilfs_transaction_lock() within nilfs_clean_segments().
>
> As a coding comment, directly referencing kbufs[4] as an array is not
> very readable, so it's better to declare a variable in the local
> variable declaration section of nilfs_clean_segments() like this:
>
>         size_t i, nsegs = argv[4].v_nmembs;
>         __u64 *segnumv = kbufs[4];
>
> and then compare by referencing segnumv[i].
> Here, the original variable name "nsegs" is confusingly similar to
> "ns_nsegments", therefore, it would be better to simply change it to
> "n" or rename it to something that more concisely represents the
> number of segments in the array.
>
> Also, to help identify the error's cause, I recommend adding an error
> message like the following within the pre-check loop:
>
>         nilfs_err(sb,
>                 "Segment number %llu to be freed is out of range",
>                 (unsigned long long)segnumv[i]);
>
> Finally, a minor comment: to avoid scattering the function's exit path
> when making corrections according to the above policy, it's better to
> add a label like the following to nilfs_clean_segments() and jump to
> it, rather than returning separately.
>
> out_unlock:
>         ...
>
> bail_unlock:
>         nilfs_transaction_unlock(sb);
>         return err;
>
> Thanks,
> Ryusuke Konishi

Thanks for the detailed feedback, Ryusuke. I have sent the patch v3

Thanks


Deepanshu Kartikey

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] nilfs2: reject CLEAN_SEGMENTS ioctl with out-of-range segment numbers
@ 2026-05-03  4:33 Ryusuke Konishi
  2026-05-04 21:14 ` Viacheslav Dubeyko
  0 siblings, 1 reply; 7+ messages in thread
From: Ryusuke Konishi @ 2026-05-03  4:33 UTC (permalink / raw)
  To: Viacheslav Dubeyko; +Cc: linux-nilfs, LKML, Deepanshu Kartikey

From: Deepanshu Kartikey <kartikey406@gmail.com>

Syzbot reported a hung task in nilfs_transaction_begin() where multiple
tasks performing chmod() on a nilfs2 mount blocked for over 143 seconds
waiting to acquire ns_segctor_sem for read:

  INFO: task syz.0.17:5918 blocked for more than 143 seconds.
  Call Trace:
   schedule+0x164/0x360
   rwsem_down_read_slowpath+0x6d9/0x940
   down_read+0x99/0x2e0
   nilfs_transaction_begin+0x364/0x710 fs/nilfs2/segment.c:221
   nilfs_setattr+0x124/0x2c0 fs/nilfs2/inode.c:921
   notify_change+0xc1a/0xf40
   chmod_common+0x273/0x4a0
   do_fchmodat+0x12d/0x230

The writer holding ns_segctor_sem was a concurrent
NILFS_IOCTL_CLEAN_SEGMENTS caller, stuck inside printk while emitting
per-element warnings from nilfs_sufile_updatev():

   __nilfs_msg+0x373/0x450 fs/nilfs2/super.c:78
   nilfs_sufile_updatev+0x21c/0x6d0 fs/nilfs2/sufile.c:186
   nilfs_sufile_freev fs/nilfs2/sufile.h:93 [inline]
   nilfs_free_segments fs/nilfs2/segment.c:1140 [inline]
   nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1261 [inline]
   nilfs_segctor_do_construct+0x1f55/0x76c0
   nilfs_clean_segments+0x3bd/0xa50
   nilfs_ioctl_clean_segments fs/nilfs2/ioctl.c:922 [inline]
   nilfs_ioctl+0x261f/0x2780

The root cause is that user-supplied segment numbers are not validated
before nilfs_clean_segments() begins doing work; the range check on
each segnum is performed deep inside the call chain by
nilfs_sufile_updatev(), which emits a nilfs_warn() per invalid entry
while still holding the segctor lock and the sufile mi_sem.  Under load
(repeated invocations across multiple mounts saturating the global
printk path), the cumulative printk latency keeps ns_segctor_sem held
long enough to trip the hung_task watchdog, blocking concurrent
operations such as chmod() that need ns_segctor_sem for read.

Fix by validating the contents of kbufs[4] in nilfs_clean_segments()
immediately after acquiring ns_segctor_sem via nilfs_transaction_lock().
Holding ns_segctor_sem serializes the check against
nilfs_ioctl_resize(), which can modify ns_nsegments, so the validation
uses a consistent value.  Out-of-range segment numbers are rejected
with -EINVAL before any segment-cleaning work begins, so the bad
entries never reach the per-element diagnostic path inside
nilfs_sufile_updatev().

Reported-by: syzbot+62f0f99d2f2bb8e3bbd7@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=62f0f99d2f2bb8e3bbd7
Tested-by: syzbot+62f0f99d2f2bb8e3bbd7@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
Fixes: 071cb4b81987 ("nilfs2: eliminate removal list of segments")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
---
Hi Viacheslav,

Please queue this patch.

This is a fix by Deepanshu that addresses the problem recently
detected by syzbot, a hang-up that can occur when GC ioctl parameters
are invalid (this time, when a segment number to be freed is
out-of-range).

Thanks,
Ryusuke Konishi

 fs/nilfs2/segment.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
index 1491a4d4b1e1..9332f5ac6083 100644
--- a/fs/nilfs2/segment.c
+++ b/fs/nilfs2/segment.c
@@ -2512,12 +2512,33 @@ int nilfs_clean_segments(struct super_block *sb, struct nilfs_argv *argv,
 	struct nilfs_sc_info *sci = nilfs->ns_writer;
 	struct nilfs_transaction_info ti;
 	int err;
+	size_t i, nfreesegs = argv[4].v_nmembs;
+	__u64 *segnumv = kbufs[4];
 
 	if (unlikely(!sci))
 		return -EROFS;
 
 	nilfs_transaction_lock(sb, &ti, 1);
 
+	/*
+	 * Validate segment numbers under ns_segctor_sem (held for write
+	 * by nilfs_transaction_lock above) so the check is serialized
+	 * against nilfs_ioctl_resize(), which can modify ns_nsegments.
+	 * Rejecting bad input here, before any segment-cleaning work
+	 * begins, avoids the per-element diagnostic path inside
+	 * nilfs_sufile_updatev() that would otherwise run under this
+	 * same lock and stall concurrent readers.
+	 */
+	for (i = 0; i < nfreesegs; i++) {
+		if (segnumv[i] >= nilfs->ns_nsegments) {
+			nilfs_err(sb,
+				 "Segment number %llu to be freed is out of range",
+				 (unsigned long long)segnumv[i]);
+			err = -EINVAL;
+			goto bail_unlock;
+		}
+	}
+
 	err = nilfs_mdt_save_to_shadow_map(nilfs->ns_dat);
 	if (unlikely(err))
 		goto out_unlock;
@@ -2558,6 +2579,7 @@ int nilfs_clean_segments(struct super_block *sb, struct nilfs_argv *argv,
 	sci->sc_freesegs = NULL;
 	sci->sc_nfreesegs = 0;
 	nilfs_mdt_clear_shadow_map(nilfs->ns_dat);
+ bail_unlock:
 	nilfs_transaction_unlock(sb);
 	return err;
 }
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] nilfs2: reject CLEAN_SEGMENTS ioctl with out-of-range segment numbers
  2026-05-03  4:33 Ryusuke Konishi
@ 2026-05-04 21:14 ` Viacheslav Dubeyko
  0 siblings, 0 replies; 7+ messages in thread
From: Viacheslav Dubeyko @ 2026-05-04 21:14 UTC (permalink / raw)
  To: Ryusuke Konishi, Viacheslav Dubeyko; +Cc: linux-nilfs, LKML, Deepanshu Kartikey

On Sun, 2026-05-03 at 13:33 +0900, Ryusuke Konishi wrote:
> From: Deepanshu Kartikey <kartikey406@gmail.com>
> 
> Syzbot reported a hung task in nilfs_transaction_begin() where multiple
> tasks performing chmod() on a nilfs2 mount blocked for over 143 seconds
> waiting to acquire ns_segctor_sem for read:
> 
>   INFO: task syz.0.17:5918 blocked for more than 143 seconds.
>   Call Trace:
>    schedule+0x164/0x360
>    rwsem_down_read_slowpath+0x6d9/0x940
>    down_read+0x99/0x2e0
>    nilfs_transaction_begin+0x364/0x710 fs/nilfs2/segment.c:221
>    nilfs_setattr+0x124/0x2c0 fs/nilfs2/inode.c:921
>    notify_change+0xc1a/0xf40
>    chmod_common+0x273/0x4a0
>    do_fchmodat+0x12d/0x230
> 
> The writer holding ns_segctor_sem was a concurrent
> NILFS_IOCTL_CLEAN_SEGMENTS caller, stuck inside printk while emitting
> per-element warnings from nilfs_sufile_updatev():
> 
>    __nilfs_msg+0x373/0x450 fs/nilfs2/super.c:78
>    nilfs_sufile_updatev+0x21c/0x6d0 fs/nilfs2/sufile.c:186
>    nilfs_sufile_freev fs/nilfs2/sufile.h:93 [inline]
>    nilfs_free_segments fs/nilfs2/segment.c:1140 [inline]
>    nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1261 [inline]
>    nilfs_segctor_do_construct+0x1f55/0x76c0
>    nilfs_clean_segments+0x3bd/0xa50
>    nilfs_ioctl_clean_segments fs/nilfs2/ioctl.c:922 [inline]
>    nilfs_ioctl+0x261f/0x2780
> 
> The root cause is that user-supplied segment numbers are not validated
> before nilfs_clean_segments() begins doing work; the range check on
> each segnum is performed deep inside the call chain by
> nilfs_sufile_updatev(), which emits a nilfs_warn() per invalid entry
> while still holding the segctor lock and the sufile mi_sem.  Under load
> (repeated invocations across multiple mounts saturating the global
> printk path), the cumulative printk latency keeps ns_segctor_sem held
> long enough to trip the hung_task watchdog, blocking concurrent
> operations such as chmod() that need ns_segctor_sem for read.
> 
> Fix by validating the contents of kbufs[4] in nilfs_clean_segments()
> immediately after acquiring ns_segctor_sem via nilfs_transaction_lock().
> Holding ns_segctor_sem serializes the check against
> nilfs_ioctl_resize(), which can modify ns_nsegments, so the validation
> uses a consistent value.  Out-of-range segment numbers are rejected
> with -EINVAL before any segment-cleaning work begins, so the bad
> entries never reach the per-element diagnostic path inside
> nilfs_sufile_updatev().
> 
> Reported-by: syzbot+62f0f99d2f2bb8e3bbd7@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=62f0f99d2f2bb8e3bbd7
> Tested-by: syzbot+62f0f99d2f2bb8e3bbd7@syzkaller.appspotmail.com
> Cc: stable@vger.kernel.org
> Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
> Fixes: 071cb4b81987 ("nilfs2: eliminate removal list of segments")
> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
> ---
> Hi Viacheslav,
> 
> Please queue this patch.
> 
> This is a fix by Deepanshu that addresses the problem recently
> detected by syzbot, a hang-up that can occur when GC ioctl parameters
> are invalid (this time, when a segment number to be freed is
> out-of-range).
> 
> Thanks,
> Ryusuke Konishi
> 
>  fs/nilfs2/segment.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/fs/nilfs2/segment.c b/fs/nilfs2/segment.c
> index 1491a4d4b1e1..9332f5ac6083 100644
> --- a/fs/nilfs2/segment.c
> +++ b/fs/nilfs2/segment.c
> @@ -2512,12 +2512,33 @@ int nilfs_clean_segments(struct super_block *sb, struct nilfs_argv *argv,
>  	struct nilfs_sc_info *sci = nilfs->ns_writer;
>  	struct nilfs_transaction_info ti;
>  	int err;
> +	size_t i, nfreesegs = argv[4].v_nmembs;
> +	__u64 *segnumv = kbufs[4];
>  
>  	if (unlikely(!sci))
>  		return -EROFS;
>  
>  	nilfs_transaction_lock(sb, &ti, 1);
>  
> +	/*
> +	 * Validate segment numbers under ns_segctor_sem (held for write
> +	 * by nilfs_transaction_lock above) so the check is serialized
> +	 * against nilfs_ioctl_resize(), which can modify ns_nsegments.
> +	 * Rejecting bad input here, before any segment-cleaning work
> +	 * begins, avoids the per-element diagnostic path inside
> +	 * nilfs_sufile_updatev() that would otherwise run under this
> +	 * same lock and stall concurrent readers.
> +	 */
> +	for (i = 0; i < nfreesegs; i++) {
> +		if (segnumv[i] >= nilfs->ns_nsegments) {
> +			nilfs_err(sb,
> +				 "Segment number %llu to be freed is out of range",
> +				 (unsigned long long)segnumv[i]);
> +			err = -EINVAL;
> +			goto bail_unlock;
> +		}
> +	}
> +
>  	err = nilfs_mdt_save_to_shadow_map(nilfs->ns_dat);
>  	if (unlikely(err))
>  		goto out_unlock;
> @@ -2558,6 +2579,7 @@ int nilfs_clean_segments(struct super_block *sb, struct nilfs_argv *argv,
>  	sci->sc_freesegs = NULL;
>  	sci->sc_nfreesegs = 0;
>  	nilfs_mdt_clear_shadow_map(nilfs->ns_dat);
> + bail_unlock:
>  	nilfs_transaction_unlock(sb);
>  	return err;
>  }

Applied.

Thanks,
Slava.


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2026-05-04 21:15 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-28  4:02 [PATCH] nilfs2: reject CLEAN_SEGMENTS ioctl with out-of-range segment numbers Deepanshu Kartikey
2026-04-28 18:58 ` Viacheslav Dubeyko
2026-04-29  1:50   ` Deepanshu Kartikey
2026-04-29 12:31     ` Ryusuke Konishi
2026-04-30  4:08       ` Deepanshu Kartikey
  -- strict thread matches above, loose matches on Subject: below --
2026-05-03  4:33 Ryusuke Konishi
2026-05-04 21:14 ` Viacheslav Dubeyko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox