* [syzbot] possible deadlock in ntfs_fiemap
@ 2022-11-30 12:51 syzbot
2022-12-09 4:00 ` syzbot
0 siblings, 1 reply; 9+ messages in thread
From: syzbot @ 2022-11-30 12:51 UTC (permalink / raw)
To: almaz.alexandrovich, linux-kernel, llvm, nathan, ndesaulniers,
ntfs3, syzkaller-bugs, trix
Hello,
syzbot found the following issue on:
HEAD commit: 01f856ae6d0c Merge tag 'net-6.1-rc8-2' of git://git.kernel..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15bc5fc3880000
kernel config: https://syzkaller.appspot.com/x/.config?x=2325e409a9a893e1
dashboard link: https://syzkaller.appspot.com/bug?extid=96cee7d33ca3f87eee86
compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
Unfortunately, I don't have any reproducer for this issue yet.
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/5428d604f56a/disk-01f856ae.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/e953d290d254/vmlinux-01f856ae.xz
kernel image: https://storage.googleapis.com/syzbot-assets/3f71610a4904/bzImage-01f856ae.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+96cee7d33ca3f87eee86@syzkaller.appspotmail.com
loop5: detected capacity change from 0 to 4096
ntfs3: loop5: Different NTFS' sector size (2048) and media sector size (512)
======================================================
WARNING: possible circular locking dependency detected
6.1.0-rc7-syzkaller-00101-g01f856ae6d0c #0 Not tainted
------------------------------------------------------
syz-executor.5/25213 is trying to acquire lock:
ffff88801d328fd8 (&mm->mmap_lock#2){++++}-{3:3}, at: __might_fault+0x8f/0x110 mm/memory.c:5645
but task is already holding lock:
ffff88801ebd34a0 (&ni->ni_lock/4){+.+.}-{3:3}, at: ni_lock fs/ntfs3/ntfs_fs.h:1108 [inline]
ffff88801ebd34a0 (&ni->ni_lock/4){+.+.}-{3:3}, at: ntfs_fiemap+0x101/0x180 fs/ntfs3/file.c:1243
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (&ni->ni_lock/4){+.+.}-{3:3}
:
lock_acquire+0x182/0x3c0 kernel/locking/lockdep.c:5668
__mutex_lock_common+0x1bd/0x26e0 kernel/locking/mutex.c:603
__mutex_lock kernel/locking/mutex.c:747 [inline]
mutex_lock_nested+0x17/0x20 kernel/locking/mutex.c:799
ni_lock fs/ntfs3/ntfs_fs.h:1108 [inline]
attr_data_get_block+0x301/0x2370 fs/ntfs3/attrib.c:917
ntfs_file_mmap+0x48c/0x730 fs/ntfs3/file.c:387
call_mmap include/linux/fs.h:2204 [inline]
mmap_region+0xfe6/0x1e20 mm/mmap.c:2625
do_mmap+0x8d9/0xf30 mm/mmap.c:1412
vm_mmap_pgoff+0x19e/0x2b0 mm/util.c:520
ksys_mmap_pgoff+0x48c/0x6d0 mm/mmap.c:1458
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
-> #0 (&mm->mmap_lock#2){++++}-{3:3}:
check_prev_add kernel/locking/lockdep.c:3097 [inline]
check_prevs_add kernel/locking/lockdep.c:3216 [inline]
validate_chain+0x1898/0x6ae0 kernel/locking/lockdep.c:3831
__lock_acquire+0x1292/0x1f60 kernel/locking/lockdep.c:5055
lock_acquire+0x182/0x3c0 kernel/locking/lockdep.c:5668
__might_fault+0xb2/0x110 mm/memory.c:5646
_copy_to_user+0x26/0x130 lib/usercopy.c:29
copy_to_user include/linux/uaccess.h:169 [inline]
fiemap_fill_next_extent+0x22e/0x410 fs/ioctl.c:144
ni_fiemap+0xf57/0x1130 fs/ntfs3/frecord.c:1934
ntfs_fiemap+0x134/0x180 fs/ntfs3/file.c:1245
ioctl_fiemap fs/ioctl.c:219 [inline]
do_vfs_ioctl+0x187f/0x29a0 fs/ioctl.c:810
__do_sys_ioctl fs/ioctl.c:868 [inline]
__se_sys_ioctl+0x83/0x170 fs/ioctl.c:856
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&ni->ni_lock/4);
lock(&mm->mmap_lock#2);
lock(&ni->ni_lock/4);
lock(&mm->mmap_lock#2
);
*** DEADLOCK ***
1 lock held by syz-executor.5/25213:
#0: ffff88801ebd34a0 (&ni->ni_lock
/4){+.+.}-{3:3}, at: ni_lock fs/ntfs3/ntfs_fs.h:1108 [inline]
/4){+.+.}-{3:3}, at: ntfs_fiemap+0x101/0x180 fs/ntfs3/file.c:1243
stack backtrace:
CPU: 0 PID: 25213 Comm: syz-executor.5 Not tainted 6.1.0-rc7-syzkaller-00101-g01f856ae6d0c #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1b1/0x28e lib/dump_stack.c:106
check_noncircular+0x2cc/0x390 kernel/locking/lockdep.c:2177
check_prev_add kernel/locking/lockdep.c:3097 [inline]
check_prevs_add kernel/locking/lockdep.c:3216 [inline]
validate_chain+0x1898/0x6ae0 kernel/locking/lockdep.c:3831
__lock_acquire+0x1292/0x1f60 kernel/locking/lockdep.c:5055
lock_acquire+0x182/0x3c0 kernel/locking/lockdep.c:5668
__might_fault+0xb2/0x110 mm/memory.c:5646
_copy_to_user+0x26/0x130 lib/usercopy.c:29
copy_to_user include/linux/uaccess.h:169 [inline]
fiemap_fill_next_extent+0x22e/0x410 fs/ioctl.c:144
ni_fiemap+0xf57/0x1130 fs/ntfs3/frecord.c:1934
ntfs_fiemap+0x134/0x180 fs/ntfs3/file.c:1245
ioctl_fiemap fs/ioctl.c:219 [inline]
do_vfs_ioctl+0x187f/0x29a0 fs/ioctl.c:810
__do_sys_ioctl fs/ioctl.c:868 [inline]
__se_sys_ioctl+0x83/0x170 fs/ioctl.c:856
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f692648c0d9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f6927202168 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f69265abf80 RCX: 00007f692648c0d9
RDX: 0000000020000240 RSI: 00000000c020660b RDI: 0000000000000004
RBP: 00007f69264e7ae9 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffe980edc3f R14: 00007f6927202300 R15: 0000000000022000
</TASK>
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [syzbot] possible deadlock in ntfs_fiemap 2022-11-30 12:51 [syzbot] possible deadlock in ntfs_fiemap syzbot @ 2022-12-09 4:00 ` syzbot 2023-04-12 13:11 ` [PATCH] fs/ntfs3: disable page fault during ntfs_fiemap() Tetsuo Handa 0 siblings, 1 reply; 9+ messages in thread From: syzbot @ 2022-12-09 4:00 UTC (permalink / raw) To: almaz.alexandrovich, linux-kernel, llvm, nathan, ndesaulniers, ntfs3, syzkaller-bugs, trix syzbot has found a reproducer for the following issue on: HEAD commit: f3e8416619ce Merge tag 'soc-fixes-6.1-5' of git://git.kern.. git tree: upstream console+strace: https://syzkaller.appspot.com/x/log.txt?x=10e21467880000 kernel config: https://syzkaller.appspot.com/x/.config?x=d58e7fe7f9cf5e24 dashboard link: https://syzkaller.appspot.com/bug?extid=96cee7d33ca3f87eee86 compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=116f8843880000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11646857880000 Downloadable assets: disk image: https://storage.googleapis.com/syzbot-assets/628abc27cbe7/disk-f3e84166.raw.xz vmlinux: https://storage.googleapis.com/syzbot-assets/2f19ea836174/vmlinux-f3e84166.xz kernel image: https://storage.googleapis.com/syzbot-assets/f2e1347e85a5/bzImage-f3e84166.xz mounted in repro: https://storage.googleapis.com/syzbot-assets/d38a9877608a/mount_0.gz IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+96cee7d33ca3f87eee86@syzkaller.appspotmail.com ====================================================== WARNING: possible circular locking dependency detected 6.1.0-rc8-syzkaller-00035-gf3e8416619ce #0 Not tainted ------------------------------------------------------ syz-executor145/3632 is trying to acquire lock: ffff88801981bb58 (&mm->mmap_lock#2){++++}-{3:3}, at: __might_fault+0x8f/0x110 mm/memory.c:5644 but task is already holding lock: ffff88807288d220 (&ni->ni_lock/4){+.+.}-{3:3}, at: ni_lock fs/ntfs3/ntfs_fs.h:1108 [inline] ffff88807288d220 (&ni->ni_lock/4){+.+.}-{3:3}, at: ntfs_fiemap+0x101/0x180 fs/ntfs3/file.c:1243 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&ni->ni_lock/4){+.+.}-{3:3}: lock_acquire+0x182/0x3c0 kernel/locking/lockdep.c:5668 __mutex_lock_common+0x1bd/0x26e0 kernel/locking/mutex.c:603 __mutex_lock kernel/locking/mutex.c:747 [inline] mutex_lock_nested+0x17/0x20 kernel/locking/mutex.c:799 ni_lock fs/ntfs3/ntfs_fs.h:1108 [inline] attr_data_get_block+0x301/0x2370 fs/ntfs3/attrib.c:917 ntfs_file_mmap+0x48c/0x730 fs/ntfs3/file.c:387 call_mmap include/linux/fs.h:2204 [inline] mmap_region+0xfe6/0x1e20 mm/mmap.c:2622 do_mmap+0x8d9/0xf30 mm/mmap.c:1412 vm_mmap_pgoff+0x19e/0x2b0 mm/util.c:520 ksys_mmap_pgoff+0x48c/0x6d0 mm/mmap.c:1458 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd -> #0 (&mm->mmap_lock#2){++++}-{3:3}: check_prev_add kernel/locking/lockdep.c:3097 [inline] check_prevs_add kernel/locking/lockdep.c:3216 [inline] validate_chain+0x1898/0x6ae0 kernel/locking/lockdep.c:3831 __lock_acquire+0x1292/0x1f60 kernel/locking/lockdep.c:5055 lock_acquire+0x182/0x3c0 kernel/locking/lockdep.c:5668 __might_fault+0xb2/0x110 mm/memory.c:5645 _copy_to_user+0x26/0x130 lib/usercopy.c:29 copy_to_user include/linux/uaccess.h:169 [inline] fiemap_fill_next_extent+0x22e/0x410 fs/ioctl.c:144 ni_fiemap+0xf57/0x1130 fs/ntfs3/frecord.c:1934 ntfs_fiemap+0x134/0x180 fs/ntfs3/file.c:1245 ioctl_fiemap fs/ioctl.c:219 [inline] do_vfs_ioctl+0x187f/0x29a0 fs/ioctl.c:810 __do_sys_ioctl fs/ioctl.c:868 [inline] __se_sys_ioctl+0x83/0x170 fs/ioctl.c:856 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&ni->ni_lock/4); lock(&mm->mmap_lock#2); lock(&ni->ni_lock/4); lock(&mm->mmap_lock#2); *** DEADLOCK *** 1 lock held by syz-executor145/3632: #0: ffff88807288d220 (&ni->ni_lock/4){+.+.}-{3:3}, at: ni_lock fs/ntfs3/ntfs_fs.h:1108 [inline] #0: ffff88807288d220 (&ni->ni_lock/4){+.+.}-{3:3}, at: ntfs_fiemap+0x101/0x180 fs/ntfs3/file.c:1243 stack backtrace: CPU: 0 PID: 3632 Comm: syz-executor145 Not tainted 6.1.0-rc8-syzkaller-00035-gf3e8416619ce #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x1b1/0x28e lib/dump_stack.c:106 check_noncircular+0x2cc/0x390 kernel/locking/lockdep.c:2177 check_prev_add kernel/locking/lockdep.c:3097 [inline] check_prevs_add kernel/locking/lockdep.c:3216 [inline] validate_chain+0x1898/0x6ae0 kernel/locking/lockdep.c:3831 __lock_acquire+0x1292/0x1f60 kernel/locking/lockdep.c:5055 lock_acquire+0x182/0x3c0 kernel/locking/lockdep.c:5668 __might_fault+0xb2/0x110 mm/memory.c:5645 _copy_to_user+0x26/0x130 lib/usercopy.c:29 copy_to_user include/linux/uaccess.h:169 [inline] fiemap_fill_next_extent+0x22e/0x410 fs/ioctl.c:144 ni_fiemap+0xf57/0x1130 fs/ntfs3/frecord.c:1934 ntfs_fiemap+0x134/0x180 fs/ntfs3/file.c:1245 ioctl_fiemap fs/ioctl.c:219 [inline] do_vfs_ioctl+0x187f/0x29a0 fs/ioctl.c:810 __do_sys_ioctl fs/ioctl.c:868 [inline] __se_sys_ioctl+0x83/0x170 fs/ioctl.c:856 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f84c755aca9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 51 14 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffdcb203e18 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f84c755aca9 RDX: 0000000020000040 RSI: 00000000c020660b RDI: 0000000000000006 RBP: 00007f84c751a2b0 R08: 0000000000000000 R09: 0000000000000000 ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] fs/ntfs3: disable page fault during ntfs_fiemap() 2022-12-09 4:00 ` syzbot @ 2023-04-12 13:11 ` Tetsuo Handa 2023-04-12 13:13 ` Matthew Wilcox 2023-04-17 14:03 ` [PATCH] vfs: allow using kernel buffer during fiemap operation Tetsuo Handa 0 siblings, 2 replies; 9+ messages in thread From: Tetsuo Handa @ 2023-04-12 13:11 UTC (permalink / raw) To: syzbot, ntfs3, syzkaller-bugs, Konstantin Komarov Cc: Hillf Danton, linux-fsdevel, linux-mm, trix, ndesaulniers, nathan syzbot is reporting circular locking dependency between ntfs_file_mmap() (which has mm->mmap_lock => ni->ni_lock dependency) and ntfs_fiemap() (which has ni->ni_lock => mm->mmap_lock dependency). Since ni_fiemap() is called by ioctl(FS_IOC_FIEMAP) via optional "struct inode_operations"->fiemap callback, I assume that importance of ni_fiemap() is lower than ntfs_file_mmap(). Also, since Documentation/filesystems/fiemap.rst says that "If an error is encountered while copying the extent to user memory, -EFAULT will be returned.", I assume that ioctl(FS_IOC_FIEMAP) users can handle -EFAULT error. Therefore, in order to eliminate possibility of deadlock, until Assumed ni_lock. TODO: Less aggressive locks. comment in ni_fiemap() is removed, use ni_fiemap() with best-effort basis (i.e. fail with -EFAULT when a page fault is inevitable). Reported-by: syzbot <syzbot+96cee7d33ca3f87eee86@syzkaller.appspotmail.com> Link: https://syzkaller.appspot.com/bug?extid=96cee7d33ca3f87eee86 Fixes: 4342306f0f0d ("fs/ntfs3: Add file operations and implementation") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> --- fs/ntfs3/file.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/fs/ntfs3/file.c b/fs/ntfs3/file.c index e9bdc1ff08c9..a9e7204e1579 100644 --- a/fs/ntfs3/file.c +++ b/fs/ntfs3/file.c @@ -1146,9 +1146,11 @@ int ntfs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, return err; ni_lock(ni); + pagefault_disable(); err = ni_fiemap(ni, fieinfo, start, len); + pagefault_enable(); ni_unlock(ni); return err; -- 2.34.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] fs/ntfs3: disable page fault during ntfs_fiemap() 2023-04-12 13:11 ` [PATCH] fs/ntfs3: disable page fault during ntfs_fiemap() Tetsuo Handa @ 2023-04-12 13:13 ` Matthew Wilcox 2023-04-12 13:29 ` Tetsuo Handa 2023-04-17 14:03 ` [PATCH] vfs: allow using kernel buffer during fiemap operation Tetsuo Handa 1 sibling, 1 reply; 9+ messages in thread From: Matthew Wilcox @ 2023-04-12 13:13 UTC (permalink / raw) To: Tetsuo Handa Cc: syzbot, ntfs3, syzkaller-bugs, Konstantin Komarov, Hillf Danton, linux-fsdevel, linux-mm, trix, ndesaulniers, nathan On Wed, Apr 12, 2023 at 10:11:08PM +0900, Tetsuo Handa wrote: > syzbot is reporting circular locking dependency between ntfs_file_mmap() > (which has mm->mmap_lock => ni->ni_lock dependency) and ntfs_fiemap() > (which has ni->ni_lock => mm->mmap_lock dependency). > > Since ni_fiemap() is called by ioctl(FS_IOC_FIEMAP) via optional > "struct inode_operations"->fiemap callback, I assume that importance of > ni_fiemap() is lower than ntfs_file_mmap(). > > Also, since Documentation/filesystems/fiemap.rst says that "If an error > is encountered while copying the extent to user memory, -EFAULT will be > returned.", I assume that ioctl(FS_IOC_FIEMAP) users can handle -EFAULT > error. What? No, that doesn't mean "You can return -EFAULT because random luck". That means "If you pass it an invalid address, you'll get -EFAULT back". NACK. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] fs/ntfs3: disable page fault during ntfs_fiemap() 2023-04-12 13:13 ` Matthew Wilcox @ 2023-04-12 13:29 ` Tetsuo Handa 2023-04-12 14:24 ` Matthew Wilcox 0 siblings, 1 reply; 9+ messages in thread From: Tetsuo Handa @ 2023-04-12 13:29 UTC (permalink / raw) To: Matthew Wilcox Cc: syzbot, ntfs3, syzkaller-bugs, Konstantin Komarov, Hillf Danton, linux-fsdevel, linux-mm, trix, ndesaulniers, nathan On 2023/04/12 22:13, Matthew Wilcox wrote: >> Also, since Documentation/filesystems/fiemap.rst says that "If an error >> is encountered while copying the extent to user memory, -EFAULT will be >> returned.", I assume that ioctl(FS_IOC_FIEMAP) users can handle -EFAULT >> error. > > What? No, that doesn't mean "You can return -EFAULT because random luck". > That means "If you pass it an invalid address, you'll get -EFAULT back". > > NACK. Then, why does fiemap.rst say "If an error is encountered" rather than "If an invalid address is passed" ? Does the definition of -EFAULT limited to "the caller does not have permission to access this address" ? ---------- int fiemap_fill_next_extent(struct fiemap_extent_info *fieinfo, u64 logical, u64 phys, u64 len, u32 flags) { struct fiemap_extent extent; struct fiemap_extent __user *dest = fieinfo->fi_extents_start; /* only count the extents */ if (fieinfo->fi_extents_max == 0) { fieinfo->fi_extents_mapped++; return (flags & FIEMAP_EXTENT_LAST) ? 1 : 0; } if (fieinfo->fi_extents_mapped >= fieinfo->fi_extents_max) return 1; if (flags & SET_UNKNOWN_FLAGS) flags |= FIEMAP_EXTENT_UNKNOWN; if (flags & SET_NO_UNMOUNTED_IO_FLAGS) flags |= FIEMAP_EXTENT_ENCODED; if (flags & SET_NOT_ALIGNED_FLAGS) flags |= FIEMAP_EXTENT_NOT_ALIGNED; memset(&extent, 0, sizeof(extent)); extent.fe_logical = logical; extent.fe_physical = phys; extent.fe_length = len; extent.fe_flags = flags; dest += fieinfo->fi_extents_mapped; if (copy_to_user(dest, &extent, sizeof(extent))) return -EFAULT; fieinfo->fi_extents_mapped++; if (fieinfo->fi_extents_mapped == fieinfo->fi_extents_max) return 1; return (flags & FIEMAP_EXTENT_LAST) ? 1 : 0; } ---------- If copy_to_user() must not fail other than "the caller does not have permission to access this address", what should we do for now? Just remove ntfs_fiemap() and return -EOPNOTSUPP ? ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] fs/ntfs3: disable page fault during ntfs_fiemap() 2023-04-12 13:29 ` Tetsuo Handa @ 2023-04-12 14:24 ` Matthew Wilcox 0 siblings, 0 replies; 9+ messages in thread From: Matthew Wilcox @ 2023-04-12 14:24 UTC (permalink / raw) To: Tetsuo Handa Cc: syzbot, ntfs3, syzkaller-bugs, Konstantin Komarov, Hillf Danton, linux-fsdevel, linux-mm, trix, ndesaulniers, nathan On Wed, Apr 12, 2023 at 10:29:37PM +0900, Tetsuo Handa wrote: > On 2023/04/12 22:13, Matthew Wilcox wrote: > >> Also, since Documentation/filesystems/fiemap.rst says that "If an error > >> is encountered while copying the extent to user memory, -EFAULT will be > >> returned.", I assume that ioctl(FS_IOC_FIEMAP) users can handle -EFAULT > >> error. > > > > What? No, that doesn't mean "You can return -EFAULT because random luck". > > That means "If you pass it an invalid address, you'll get -EFAULT back". > > > > NACK. > > Then, why does fiemap.rst say "If an error is encountered" rather than > "If an invalid address is passed" ? Because people are bad at writing. > Does the definition of -EFAULT limited to "the caller does not have permission > to access this address" ? Or the address isn't mapped. > If copy_to_user() must not fail other than "the caller does not have > permission to access this address", what should we do for now? > Just remove ntfs_fiemap() and return -EOPNOTSUPP ? No, fix it properly. Or don't fix it at all and let somebody else fix it. ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH] vfs: allow using kernel buffer during fiemap operation 2023-04-12 13:11 ` [PATCH] fs/ntfs3: disable page fault during ntfs_fiemap() Tetsuo Handa 2023-04-12 13:13 ` Matthew Wilcox @ 2023-04-17 14:03 ` Tetsuo Handa 2023-04-20 21:00 ` Al Viro 1 sibling, 1 reply; 9+ messages in thread From: Tetsuo Handa @ 2023-04-17 14:03 UTC (permalink / raw) To: syzbot, ntfs3, syzkaller-bugs, Mark Fasheh, Alexander Viro, Christian Brauner, Konstantin Komarov, linux-fsdevel Cc: Hillf Danton, linux-mm, trix, ndesaulniers, nathan syzbot is reporting circular locking dependency between ntfs_file_mmap() (which has mm->mmap_lock => ni->ni_lock => ni->file.run_lock dependency) and ntfs_fiemap() (which has ni->ni_lock => ni->file.run_lock => mm->mmap_lock dependency), for commit c4b929b85bdb ("vfs: vfs-level fiemap interface") implemented fiemap_fill_next_extent() using copy_to_user() where direct mm->mmap_lock dependency is inevitable. Since ntfs3 does not want to release ni->ni_lock and/or ni->file.run_lock in order to make sure that "struct ATTRIB" does not change during ioctl(FS_IOC_FIEMAP) request, let's make it possible to call fiemap_fill_next_extent() with filesystem locks held. This patch adds fiemap_fill_next_kernel_extent() which spools "struct fiemap_extent" to dynamically allocated kernel buffer, and fiemap_copy_kernel_extent() which copies spooled "struct fiemap_extent" to userspace buffer after filesystem locks are released. Reported-by: syzbot <syzbot+96cee7d33ca3f87eee86@syzkaller.appspotmail.com> Link: https://syzkaller.appspot.com/bug?extid=96cee7d33ca3f87eee86 Reported-by: syzbot <syzbot+c300ab283ba3bc072439@syzkaller.appspotmail.com> Link: https://syzkaller.appspot.com/bug?extid=c300ab283ba3bc072439 Fixes: 4342306f0f0d ("fs/ntfs3: Add file operations and implementation") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> --- fs/ioctl.c | 52 ++++++++++++++++++++++++++++++++++++------ fs/ntfs3/file.c | 4 ++++ fs/ntfs3/frecord.c | 10 ++++---- include/linux/fiemap.h | 24 +++++++++++++++++-- 4 files changed, 76 insertions(+), 14 deletions(-) diff --git a/fs/ioctl.c b/fs/ioctl.c index 5b2481cd4750..60ddc2760932 100644 --- a/fs/ioctl.c +++ b/fs/ioctl.c @@ -112,11 +112,10 @@ static int ioctl_fibmap(struct file *filp, int __user *p) #define SET_UNKNOWN_FLAGS (FIEMAP_EXTENT_DELALLOC) #define SET_NO_UNMOUNTED_IO_FLAGS (FIEMAP_EXTENT_DATA_ENCRYPTED) #define SET_NOT_ALIGNED_FLAGS (FIEMAP_EXTENT_DATA_TAIL|FIEMAP_EXTENT_DATA_INLINE) -int fiemap_fill_next_extent(struct fiemap_extent_info *fieinfo, u64 logical, - u64 phys, u64 len, u32 flags) +int do_fiemap_fill_next_extent(struct fiemap_extent_info *fieinfo, u64 logical, + u64 phys, u64 len, u32 flags, bool is_kernel) { struct fiemap_extent extent; - struct fiemap_extent __user *dest = fieinfo->fi_extents_start; /* only count the extents */ if (fieinfo->fi_extents_max == 0) { @@ -140,16 +139,55 @@ int fiemap_fill_next_extent(struct fiemap_extent_info *fieinfo, u64 logical, extent.fe_length = len; extent.fe_flags = flags; - dest += fieinfo->fi_extents_mapped; - if (copy_to_user(dest, &extent, sizeof(extent))) - return -EFAULT; + if (!is_kernel) { + struct fiemap_extent __user *dest = fieinfo->fi_extents_start; + + dest += fieinfo->fi_extents_mapped; + if (copy_to_user(dest, &extent, sizeof(extent))) + return -EFAULT; + } else { + struct fiemap_extent_list *entry = kmalloc(sizeof(*entry), GFP_NOFS); + + if (!entry) + return -ENOMEM; + memmove(&entry->extent, &extent, sizeof(extent)); + list_add_tail(&entry->list, &fieinfo->fi_extents_list); + } fieinfo->fi_extents_mapped++; if (fieinfo->fi_extents_mapped == fieinfo->fi_extents_max) return 1; return (flags & FIEMAP_EXTENT_LAST) ? 1 : 0; } -EXPORT_SYMBOL(fiemap_fill_next_extent); +EXPORT_SYMBOL(do_fiemap_fill_next_extent); + +int fiemap_copy_kernel_extent(struct fiemap_extent_info *fieinfo, int err) +{ + struct fiemap_extent __user *dest; + struct fiemap_extent_list *entry, *tmp; + unsigned int len = 0; + + list_for_each_entry(entry, &fieinfo->fi_extents_list, list) + len++; + if (!len) + return err; + fieinfo->fi_extents_mapped -= len; + dest = fieinfo->fi_extents_start + fieinfo->fi_extents_mapped; + list_for_each_entry(entry, &fieinfo->fi_extents_list, list) { + if (copy_to_user(dest, &entry->extent, sizeof(entry->extent))) { + err = -EFAULT; + break; + } + dest++; + fieinfo->fi_extents_mapped++; + } + list_for_each_entry_safe(entry, tmp, &fieinfo->fi_extents_list, list) { + list_del(&entry->list); + kfree(entry); + } + return err; +} +EXPORT_SYMBOL(fiemap_copy_kernel_extent); /** * fiemap_prep - check validity of requested flags for fiemap diff --git a/fs/ntfs3/file.c b/fs/ntfs3/file.c index e9bdc1ff08c9..1a3e28f71599 100644 --- a/fs/ntfs3/file.c +++ b/fs/ntfs3/file.c @@ -1145,12 +1145,16 @@ int ntfs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo, if (err) return err; + INIT_LIST_HEAD(&fieinfo->fi_extents_list); + ni_lock(ni); err = ni_fiemap(ni, fieinfo, start, len); ni_unlock(ni); + err = fiemap_copy_kernel_extent(fieinfo, err); + return err; } diff --git a/fs/ntfs3/frecord.c b/fs/ntfs3/frecord.c index f1df52dfab74..b70f9dfb71ab 100644 --- a/fs/ntfs3/frecord.c +++ b/fs/ntfs3/frecord.c @@ -1941,8 +1941,7 @@ int ni_fiemap(struct ntfs_inode *ni, struct fiemap_extent_info *fieinfo, } if (!attr || !attr->non_res) { - err = fiemap_fill_next_extent( - fieinfo, 0, 0, + err = fiemap_fill_next_kernel_extent(fieinfo, 0, 0, attr ? le32_to_cpu(attr->res.data_size) : 0, FIEMAP_EXTENT_DATA_INLINE | FIEMAP_EXTENT_LAST | FIEMAP_EXTENT_MERGED); @@ -2037,8 +2036,8 @@ int ni_fiemap(struct ntfs_inode *ni, struct fiemap_extent_info *fieinfo, if (vbo + dlen >= end) flags |= FIEMAP_EXTENT_LAST; - err = fiemap_fill_next_extent(fieinfo, vbo, lbo, dlen, - flags); + err = fiemap_fill_next_kernel_extent(fieinfo, vbo, lbo, + dlen, flags); if (err < 0) break; if (err == 1) { @@ -2058,7 +2057,8 @@ int ni_fiemap(struct ntfs_inode *ni, struct fiemap_extent_info *fieinfo, if (vbo + bytes >= end) flags |= FIEMAP_EXTENT_LAST; - err = fiemap_fill_next_extent(fieinfo, vbo, lbo, bytes, flags); + err = fiemap_fill_next_kernel_extent(fieinfo, vbo, lbo, bytes, + flags); if (err < 0) break; if (err == 1) { diff --git a/include/linux/fiemap.h b/include/linux/fiemap.h index c50882f19235..10cb33ed80a9 100644 --- a/include/linux/fiemap.h +++ b/include/linux/fiemap.h @@ -5,17 +5,37 @@ #include <uapi/linux/fiemap.h> #include <linux/fs.h> +struct fiemap_extent_list { + struct list_head list; + struct fiemap_extent extent; +}; + struct fiemap_extent_info { unsigned int fi_flags; /* Flags as passed from user */ unsigned int fi_extents_mapped; /* Number of mapped extents */ unsigned int fi_extents_max; /* Size of fiemap_extent array */ struct fiemap_extent __user *fi_extents_start; /* Start of fiemap_extent array */ + struct list_head fi_extents_list; /* List of fiemap_extent_list */ }; int fiemap_prep(struct inode *inode, struct fiemap_extent_info *fieinfo, u64 start, u64 *len, u32 supported_flags); -int fiemap_fill_next_extent(struct fiemap_extent_info *info, u64 logical, - u64 phys, u64 len, u32 flags); +int do_fiemap_fill_next_extent(struct fiemap_extent_info *fieinfo, u64 logical, + u64 phys, u64 len, u32 flags, bool is_kernel); + +static inline int fiemap_fill_next_extent(struct fiemap_extent_info *info, + u64 logical, u64 phys, u64 len, u32 flags) +{ + return do_fiemap_fill_next_extent(info, logical, phys, len, flags, false); +} + +static inline int fiemap_fill_next_kernel_extent(struct fiemap_extent_info *info, + u64 logical, u64 phys, u64 len, u32 flags) +{ + return do_fiemap_fill_next_extent(info, logical, phys, len, flags, true); +} + +int fiemap_copy_kernel_extent(struct fiemap_extent_info *info, int err); #endif /* _LINUX_FIEMAP_H 1 */ -- 2.34.1 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] vfs: allow using kernel buffer during fiemap operation 2023-04-17 14:03 ` [PATCH] vfs: allow using kernel buffer during fiemap operation Tetsuo Handa @ 2023-04-20 21:00 ` Al Viro 2023-04-20 21:11 ` Al Viro 0 siblings, 1 reply; 9+ messages in thread From: Al Viro @ 2023-04-20 21:00 UTC (permalink / raw) To: Tetsuo Handa Cc: syzbot, ntfs3, syzkaller-bugs, Mark Fasheh, Christian Brauner, Konstantin Komarov, linux-fsdevel, Hillf Danton, linux-mm, trix, ndesaulniers, nathan On Mon, Apr 17, 2023 at 11:03:26PM +0900, Tetsuo Handa wrote: > syzbot is reporting circular locking dependency between ntfs_file_mmap() > (which has mm->mmap_lock => ni->ni_lock => ni->file.run_lock dependency) > and ntfs_fiemap() (which has ni->ni_lock => ni->file.run_lock => > mm->mmap_lock dependency), for commit c4b929b85bdb ("vfs: vfs-level fiemap > interface") implemented fiemap_fill_next_extent() using copy_to_user() > where direct mm->mmap_lock dependency is inevitable. > > Since ntfs3 does not want to release ni->ni_lock and/or ni->file.run_lock > in order to make sure that "struct ATTRIB" does not change during > ioctl(FS_IOC_FIEMAP) request, let's make it possible to call > fiemap_fill_next_extent() with filesystem locks held. > > This patch adds fiemap_fill_next_kernel_extent() which spools > "struct fiemap_extent" to dynamically allocated kernel buffer, and > fiemap_copy_kernel_extent() which copies spooled "struct fiemap_extent" > to userspace buffer after filesystem locks are released. Ugh... I'm pretty certain that this is a wrong approach. What is going on in ntfs_file_mmap() that requires that kind of locking? AFAICS, that's the part that looks odd... Details, please. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] vfs: allow using kernel buffer during fiemap operation 2023-04-20 21:00 ` Al Viro @ 2023-04-20 21:11 ` Al Viro 0 siblings, 0 replies; 9+ messages in thread From: Al Viro @ 2023-04-20 21:11 UTC (permalink / raw) To: Tetsuo Handa Cc: syzbot, ntfs3, syzkaller-bugs, Mark Fasheh, Christian Brauner, Konstantin Komarov, linux-fsdevel, Hillf Danton, linux-mm, trix, ndesaulniers, nathan On Thu, Apr 20, 2023 at 10:00:45PM +0100, Al Viro wrote: > On Mon, Apr 17, 2023 at 11:03:26PM +0900, Tetsuo Handa wrote: > > syzbot is reporting circular locking dependency between ntfs_file_mmap() > > (which has mm->mmap_lock => ni->ni_lock => ni->file.run_lock dependency) > > and ntfs_fiemap() (which has ni->ni_lock => ni->file.run_lock => > > mm->mmap_lock dependency), for commit c4b929b85bdb ("vfs: vfs-level fiemap > > interface") implemented fiemap_fill_next_extent() using copy_to_user() > > where direct mm->mmap_lock dependency is inevitable. > > > > Since ntfs3 does not want to release ni->ni_lock and/or ni->file.run_lock > > in order to make sure that "struct ATTRIB" does not change during > > ioctl(FS_IOC_FIEMAP) request, let's make it possible to call > > fiemap_fill_next_extent() with filesystem locks held. > > > > This patch adds fiemap_fill_next_kernel_extent() which spools > > "struct fiemap_extent" to dynamically allocated kernel buffer, and > > fiemap_copy_kernel_extent() which copies spooled "struct fiemap_extent" > > to userspace buffer after filesystem locks are released. > > Ugh... I'm pretty certain that this is a wrong approach. What is going > on in ntfs_file_mmap() that requires that kind of locking? > > AFAICS, that's the part that looks odd... Details, please. if (ni->i_valid < to) { if (!inode_trylock(inode)) { err = -EAGAIN; goto out; } err = ntfs_extend_initialized_size(file, ni, ni->i_valid, to); inode_unlock(inode); if (err) goto out; } See that inode_trylock() there? That's another sign of the same problem; it's just that their internal locks (taken by ntfs_extend_initialized_size()) are taken without the same kind of papering over the problem. 'to' here is guaranteed to be under the i_size_read(inode); is that some kind of delayed allocation? Or lazy extending truncate(), perhaps? I'm not familiar with ntfs codebase (or ntfs layout, for that matter), so could somebody describe what exactly is going on in that code? Frankly, my impression is that this stuff is done in the wrong place... ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2023-04-20 21:14 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-11-30 12:51 [syzbot] possible deadlock in ntfs_fiemap syzbot 2022-12-09 4:00 ` syzbot 2023-04-12 13:11 ` [PATCH] fs/ntfs3: disable page fault during ntfs_fiemap() Tetsuo Handa 2023-04-12 13:13 ` Matthew Wilcox 2023-04-12 13:29 ` Tetsuo Handa 2023-04-12 14:24 ` Matthew Wilcox 2023-04-17 14:03 ` [PATCH] vfs: allow using kernel buffer during fiemap operation Tetsuo Handa 2023-04-20 21:00 ` Al Viro 2023-04-20 21:11 ` Al Viro
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox