* Re: [syzbot] [mm?] KCSAN: data-race in generic_fillattr / shmem_mknod (2)
       [not found] ` <CACT4Y+a=xWkNGw_iKibRp4ivSE8OJkWWT0VPQ4N4d1+vj0FMdg@mail.gmail.com>
@ 2023-05-01  5:15   ` Tetsuo Handa
  2023-05-01 14:05     ` Tetsuo Handa
  2023-05-02  6:13     ` Dmitry Vyukov
  0 siblings, 2 replies; 5+ messages in thread
From: Tetsuo Handa @ 2023-05-01  5:15 UTC
To: linux-fsdevel, Alexander Viro
Cc: akpm, hughd, linux-kernel, linux-mm, syzkaller-bugs, syzbot, Dmitry Vyukov

On 2023/04/24 17:26, Dmitry Vyukov wrote:
>> HEAD commit:    457391b03803 Linux 6.3
>> git tree:       upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=13226cf0280000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=8c81c9a3d360ebcf
>> dashboard link: https://syzkaller.appspot.com/bug?extid=702361cf7e3d95758761
>> compiler:       Debian clang version 15.0.7, GNU ld (GNU Binutils for Debian) 2.35.2
>
> I think shmem_mknod() needs to use i_size_write() to update the size.
> Writes to i_size are not assumed to be atomic throughout the kernel
> code.
>

I don't think that using i_size_{read,write}() alone is sufficient,
for I think that i_size_{read,write}() needs data_race() annotation.

 include/linux/fs.h |   13 +++++++++++--
 mm/shmem.c         |   12 ++++++------
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 21a981680856..0d067bbe3ee9 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -860,6 +860,13 @@ void filemap_invalidate_unlock_two(struct address_space *mapping1,
  * the read or for example on x86 they can be still implemented as a
  * cmpxchg8b without the need of the lock prefix). For SMP compiles
  * and 64bit archs it makes no difference if preempt is enabled or not.
+ *
+ * However, when KCSAN is enabled, CPU being capable of reading/updating
+ * naturally aligned 8 bytes of memory atomically is not sufficient for
+ * avoiding KCSAN warning, for KCSAN checks whether value has changed between
+ * before and after of a read operation. But since we don't want to introduce
+ * seqcount overhead only for suppressing KCSAN warning, tell KCSAN that data
+ * race on accessing i_size field is acceptable.
  */
 static inline loff_t i_size_read(const struct inode *inode)
 {
@@ -880,7 +887,8 @@ static inline loff_t i_size_read(const struct inode *inode)
 	preempt_enable();
 	return i_size;
 #else
-	return inode->i_size;
+	/* See comment above. */
+	return data_race(inode->i_size);
 #endif
 }
 
@@ -902,7 +910,8 @@ static inline void i_size_write(struct inode *inode, loff_t i_size)
 	inode->i_size = i_size;
 	preempt_enable();
 #else
-	inode->i_size = i_size;
+	/* See comment above. */
+	data_race(inode->i_size = i_size);
 #endif
 }
 
diff --git a/mm/shmem.c b/mm/shmem.c
index e40a08c5c6d7..a2f20297fb59 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2951,7 +2951,7 @@ shmem_mknod(struct mnt_idmap *idmap, struct inode *dir,
 			goto out_iput;
 
 		error = 0;
-		dir->i_size += BOGO_DIRENT_SIZE;
+		i_size_write(dir, i_size_read(dir) + BOGO_DIRENT_SIZE);
 		dir->i_ctime = dir->i_mtime = current_time(dir);
 		inode_inc_iversion(dir);
 		d_instantiate(dentry, inode);
@@ -3027,7 +3027,7 @@ static int shmem_link(struct dentry *old_dentry, struct inode *dir, struct dentr
 			goto out;
 	}
 
-	dir->i_size += BOGO_DIRENT_SIZE;
+	i_size_write(dir, i_size_read(dir) + BOGO_DIRENT_SIZE);
 	inode->i_ctime = dir->i_ctime = dir->i_mtime = current_time(inode);
 	inode_inc_iversion(dir);
 	inc_nlink(inode);
@@ -3045,7 +3045,7 @@ static int shmem_unlink(struct inode *dir, struct dentry *dentry)
 	if (inode->i_nlink > 1 && !S_ISDIR(inode->i_mode))
 		shmem_free_inode(inode->i_sb);
 
-	dir->i_size -= BOGO_DIRENT_SIZE;
+	i_size_write(dir, i_size_read(dir) - BOGO_DIRENT_SIZE);
 	inode->i_ctime = dir->i_ctime = dir->i_mtime = current_time(inode);
 	inode_inc_iversion(dir);
 	drop_nlink(inode);
@@ -3132,8 +3132,8 @@ static int shmem_rename2(struct mnt_idmap *idmap,
 		inc_nlink(new_dir);
 	}
 
-	old_dir->i_size -= BOGO_DIRENT_SIZE;
-	new_dir->i_size += BOGO_DIRENT_SIZE;
+	i_size_write(old_dir, i_size_read(old_dir) - BOGO_DIRENT_SIZE);
+	i_size_write(new_dir, i_size_read(new_dir) + BOGO_DIRENT_SIZE);
 	old_dir->i_ctime = old_dir->i_mtime =
 	new_dir->i_ctime = new_dir->i_mtime =
 	inode->i_ctime = current_time(old_dir);
@@ -3189,7 +3189,7 @@ static int shmem_symlink(struct mnt_idmap *idmap, struct inode *dir,
 		folio_unlock(folio);
 		folio_put(folio);
 	}
-	dir->i_size += BOGO_DIRENT_SIZE;
+	i_size_write(dir, i_size_read(dir) + BOGO_DIRENT_SIZE);
 	dir->i_ctime = dir->i_mtime = current_time(dir);
 	inode_inc_iversion(dir);
 	d_instantiate(dentry, inode);

Maybe we want i_size_add() ?

Also, there was a similar report on updating i_{ctime,mtime} to current_time()
which means that i_size is not the only field that is causing data race.
https://syzkaller.appspot.com/bug?id=067d40ab9ab23a6fa0a8156857ed54e295062a29

Hmm, where is the serialization that avoids concurrent
shmem_mknod()/shmem_mknod() or shmem_mknod()/shmem_unlink() ?
i_size_write() says "need locking around it (normally i_mutex)"...
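
Editor's sketch (not part of the thread): the message above asks whether an i_size_add() helper would be useful. The user-space C below is only an illustration of what such a helper could look like. Every name ending in _sketch is hypothetical, the value of BOGO_DIRENT_SIZE is an assumption, and C11 relaxed atomics stand in for the kernel's marked accesses. The add is deliberately not an atomic read-modify-write: like the open-coded i_size_write(dir, i_size_read(dir) + ...) in the patch, it assumes that writers are already serialized by a lock.

```c
#include <assert.h>
#include <stdatomic.h>

/* Assumed value; only the way it is used matters for this sketch. */
#define BOGO_DIRENT_SIZE 20

/* Hypothetical user-space stand-in for the i_size field of an inode. */
struct inode_sketch {
	_Atomic long long i_size;
};

/* Marked read: a single untorn load that lockless readers may use. */
static inline long long i_size_read_sketch(struct inode_sketch *inode)
{
	return atomic_load_explicit(&inode->i_size, memory_order_relaxed);
}

/* Marked write: a single untorn store; the caller holds the writer lock. */
static inline void i_size_write_sketch(struct inode_sketch *inode,
				       long long size)
{
	assert(size >= 0);
	atomic_store_explicit(&inode->i_size, size, memory_order_relaxed);
}

/*
 * Hypothetical i_size_add(): folds the read/modify/write triple used at
 * every call site in the patch into one helper. It is NOT an atomic
 * read-modify-write; two unserialized callers could still lose an update.
 */
static inline void i_size_add_sketch(struct inode_sketch *inode,
				     long long delta)
{
	i_size_write_sketch(inode, i_size_read_sketch(inode) + delta);
}
```

With such a helper, a call site like the one in shmem_mknod() would shrink to a single call taking BOGO_DIRENT_SIZE, and shmem_unlink() would pass the negated value.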

* Re: [syzbot] [mm?] KCSAN: data-race in generic_fillattr / shmem_mknod (2)
  2023-05-01  5:15   ` [syzbot] [mm?] KCSAN: data-race in generic_fillattr / shmem_mknod (2) Tetsuo Handa
@ 2023-05-01 14:05     ` Tetsuo Handa
  2023-05-02 10:13       ` Tetsuo Handa
  2023-05-02  6:13     ` Dmitry Vyukov
  1 sibling, 1 reply; 5+ messages in thread
From: Tetsuo Handa @ 2023-05-01 14:05 UTC
To: linux-fsdevel, Alexander Viro
Cc: akpm, hughd, linux-kernel, linux-mm, syzkaller-bugs, syzbot, Dmitry Vyukov

On 2023/05/01 14:15, Tetsuo Handa wrote:
> Hmm, where is the serialization that avoids concurrent
> shmem_mknod()/shmem_mknod() or shmem_mknod()/shmem_unlink() ?
> i_size_write() says "need locking around it (normally i_mutex)"...
>

Since filename_create() calls inode_lock_nested(path->dentry->d_inode, I_MUTEX_PARENT)
and done_path_create() calls inode_unlock(path->dentry->d_inode), serialization looks OK.
Just the name is no longer i_mutex ?

> Also, there was a similar report on updating i_{ctime,mtime} to current_time()
> which means that i_size is not the only field that is causing data race.
> https://syzkaller.appspot.com/bug?id=067d40ab9ab23a6fa0a8156857ed54e295062a29

Do we want to as well wrap i_{ctime,mtime} using data_race() ?

* Re: [syzbot] [mm?] KCSAN: data-race in generic_fillattr / shmem_mknod (2)
  2023-05-01 14:05     ` Tetsuo Handa
@ 2023-05-02 10:13       ` Tetsuo Handa
  0 siblings, 0 replies; 5+ messages in thread
From: Tetsuo Handa @ 2023-05-02 10:13 UTC
To: linux-fsdevel, Alexander Viro
Cc: akpm, hughd, linux-kernel, linux-mm, syzkaller-bugs, syzbot, Dmitry Vyukov

On 2023/05/01 23:05, Tetsuo Handa wrote:
>> Also, there was a similar report on updating i_{ctime,mtime} to current_time()
>> which means that i_size is not the only field that is causing data race.
>> https://syzkaller.appspot.com/bug?id=067d40ab9ab23a6fa0a8156857ed54e295062a29
>
> Do we want to as well wrap i_{ctime,mtime} using data_race() ?
>

I think we need to use inode_lock_shared()/inode_unlock_shared() when calling
generic_fillattr(), for i_{ctime,mtime} (128bits) are too large to copy atomically.

Is it safe to call inode_lock_shared()/inode_unlock_shared() from generic_fillattr()?
Is some filesystem already holding inode lock before calling generic_fillattr()?
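
Editor's sketch (not part of the thread): the message above observes that a 128-bit timestamp cannot be copied atomically, and proposes taking the inode lock in shared mode around generic_fillattr(). The user-space C below illustrates a different technique, a sequence counter in the style the patch comment calls "seqcount", only to show how a reader can detect a torn two-word copy without taking a lock. It is not what the message proposes, all names are hypothetical, and C11 atomics stand in for kernel primitives. Writers are assumed to be serialized already, as established earlier in the thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a 128-bit timestamp: two 64-bit halves. */
struct ts_sketch {
	long long sec;
	long long nsec;
};

struct stamped_inode_sketch {
	atomic_uint seq;	/* even: stable, odd: update in progress */
	_Atomic long long sec;
	_Atomic long long nsec;
};

/* Writer side. Callers must already be serialized against each other. */
static void mtime_write_begin(struct stamped_inode_sketch *ino)
{
	unsigned int s = atomic_load_explicit(&ino->seq, memory_order_relaxed);

	atomic_store_explicit(&ino->seq, s + 1, memory_order_relaxed);
	/* Order the odd sequence value before the data stores that follow. */
	atomic_thread_fence(memory_order_release);
}

static void mtime_write_end(struct stamped_inode_sketch *ino)
{
	unsigned int s = atomic_load_explicit(&ino->seq, memory_order_relaxed);

	/* Release: the data stores become visible before the even value. */
	atomic_store_explicit(&ino->seq, s + 1, memory_order_release);
}

static void mtime_set(struct stamped_inode_sketch *ino, struct ts_sketch ts)
{
	assert(ts.nsec >= 0 && ts.nsec < 1000000000LL);
	mtime_write_begin(ino);
	atomic_store_explicit(&ino->sec, ts.sec, memory_order_relaxed);
	atomic_store_explicit(&ino->nsec, ts.nsec, memory_order_relaxed);
	mtime_write_end(ino);
}

/* One read attempt; returns false when the copy may mix two updates. */
static bool mtime_try_get(struct stamped_inode_sketch *ino,
			  struct ts_sketch *out)
{
	unsigned int before, after;

	before = atomic_load_explicit(&ino->seq, memory_order_acquire);
	out->sec = atomic_load_explicit(&ino->sec, memory_order_relaxed);
	out->nsec = atomic_load_explicit(&ino->nsec, memory_order_relaxed);
	/* Order the data loads before the sequence re-check. */
	atomic_thread_fence(memory_order_acquire);
	after = atomic_load_explicit(&ino->seq, memory_order_relaxed);
	return !(before & 1) && before == after;
}

/* Lockless reader: retry until a consistent pair has been copied. */
static struct ts_sketch mtime_get(struct stamped_inode_sketch *ino)
{
	struct ts_sketch ts;

	while (!mtime_try_get(ino, &ts))
		;
	return ts;
}
```

The trade-off against the shared-lock proposal is that readers never block and cannot deadlock with a caller that already holds the inode lock, at the cost of an extra counter per inode and a retry loop on the read side.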

* Re: [syzbot] [mm?] KCSAN: data-race in generic_fillattr / shmem_mknod (2)
  2023-05-01  5:15   ` [syzbot] [mm?] KCSAN: data-race in generic_fillattr / shmem_mknod (2) Tetsuo Handa
  2023-05-01 14:05     ` Tetsuo Handa
@ 2023-05-02  6:13     ` Dmitry Vyukov
  1 sibling, 0 replies; 5+ messages in thread
From: Dmitry Vyukov @ 2023-05-02  6:13 UTC
To: Tetsuo Handa
Cc: linux-fsdevel, Alexander Viro, akpm, hughd, linux-kernel, linux-mm, syzkaller-bugs, syzbot

On Mon, 1 May 2023 at 07:16, Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2023/04/24 17:26, Dmitry Vyukov wrote:
> >> HEAD commit:    457391b03803 Linux 6.3
> >> git tree:       upstream
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=13226cf0280000
> >> kernel config:  https://syzkaller.appspot.com/x/.config?x=8c81c9a3d360ebcf
> >> dashboard link: https://syzkaller.appspot.com/bug?extid=702361cf7e3d95758761
> >> compiler:       Debian clang version 15.0.7, GNU ld (GNU Binutils for Debian) 2.35.2
> >
> > I think shmem_mknod() needs to use i_size_write() to update the size.
> > Writes to i_size are not assumed to be atomic throughout the kernel
> > code.
> >
>
> I don't think that using i_size_{read,write}() alone is sufficient,
> for I think that i_size_{read,write}() needs data_race() annotation.

Agree. Or better proper READ/WRITE_ONCE.
data_race() is just an annotation, it does not fix the actual data
race bug that is present there.

I see there are lots of uses of i_size_read() in complex scenarios
that involve comparisons of the size. All such racy uses are subject
to the TOCTOU bug at least.

>  include/linux/fs.h |   13 +++++++++++--
>  mm/shmem.c         |   12 ++++++------
>  2 files changed, 17 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 21a981680856..0d067bbe3ee9 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -860,6 +860,13 @@ void filemap_invalidate_unlock_two(struct address_space *mapping1,
>   * the read or for example on x86 they can be still implemented as a
>   * cmpxchg8b without the need of the lock prefix). For SMP compiles
>   * and 64bit archs it makes no difference if preempt is enabled or not.
> + *
> + * However, when KCSAN is enabled, CPU being capable of reading/updating
> + * naturally aligned 8 bytes of memory atomically is not sufficient for
> + * avoiding KCSAN warning, for KCSAN checks whether value has changed between
> + * before and after of a read operation. But since we don't want to introduce
> + * seqcount overhead only for suppressing KCSAN warning, tell KCSAN that data
> + * race on accessing i_size field is acceptable.
>   */
>  static inline loff_t i_size_read(const struct inode *inode)
>  {
> @@ -880,7 +887,8 @@ static inline loff_t i_size_read(const struct inode *inode)
>  	preempt_enable();
>  	return i_size;
>  #else
> -	return inode->i_size;
> +	/* See comment above. */
> +	return data_race(inode->i_size);
>  #endif
>  }
>
> @@ -902,7 +910,8 @@ static inline void i_size_write(struct inode *inode, loff_t i_size)
>  	inode->i_size = i_size;
>  	preempt_enable();
>  #else
> -	inode->i_size = i_size;
> +	/* See comment above. */
> +	data_race(inode->i_size = i_size);
>  #endif
>  }
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index e40a08c5c6d7..a2f20297fb59 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2951,7 +2951,7 @@ shmem_mknod(struct mnt_idmap *idmap, struct inode *dir,
>  			goto out_iput;
>
>  		error = 0;
> -		dir->i_size += BOGO_DIRENT_SIZE;
> +		i_size_write(dir, i_size_read(dir) + BOGO_DIRENT_SIZE);
>  		dir->i_ctime = dir->i_mtime = current_time(dir);
>  		inode_inc_iversion(dir);
>  		d_instantiate(dentry, inode);
> @@ -3027,7 +3027,7 @@ static int shmem_link(struct dentry *old_dentry, struct inode *dir, struct dentr
>  			goto out;
>  	}
>
> -	dir->i_size += BOGO_DIRENT_SIZE;
> +	i_size_write(dir, i_size_read(dir) + BOGO_DIRENT_SIZE);
>  	inode->i_ctime = dir->i_ctime = dir->i_mtime = current_time(inode);
>  	inode_inc_iversion(dir);
>  	inc_nlink(inode);
> @@ -3045,7 +3045,7 @@ static int shmem_unlink(struct inode *dir, struct dentry *dentry)
>  	if (inode->i_nlink > 1 && !S_ISDIR(inode->i_mode))
>  		shmem_free_inode(inode->i_sb);
>
> -	dir->i_size -= BOGO_DIRENT_SIZE;
> +	i_size_write(dir, i_size_read(dir) - BOGO_DIRENT_SIZE);
>  	inode->i_ctime = dir->i_ctime = dir->i_mtime = current_time(inode);
>  	inode_inc_iversion(dir);
>  	drop_nlink(inode);
> @@ -3132,8 +3132,8 @@ static int shmem_rename2(struct mnt_idmap *idmap,
>  		inc_nlink(new_dir);
>  	}
>
> -	old_dir->i_size -= BOGO_DIRENT_SIZE;
> -	new_dir->i_size += BOGO_DIRENT_SIZE;
> +	i_size_write(old_dir, i_size_read(old_dir) - BOGO_DIRENT_SIZE);
> +	i_size_write(new_dir, i_size_read(new_dir) + BOGO_DIRENT_SIZE);
>  	old_dir->i_ctime = old_dir->i_mtime =
>  	new_dir->i_ctime = new_dir->i_mtime =
>  	inode->i_ctime = current_time(old_dir);
> @@ -3189,7 +3189,7 @@ static int shmem_symlink(struct mnt_idmap *idmap, struct inode *dir,
>  		folio_unlock(folio);
>  		folio_put(folio);
>  	}
> -	dir->i_size += BOGO_DIRENT_SIZE;
> +	i_size_write(dir, i_size_read(dir) + BOGO_DIRENT_SIZE);
>  	dir->i_ctime = dir->i_mtime = current_time(dir);
>  	inode_inc_iversion(dir);
>  	d_instantiate(dentry, inode);
>
> Maybe we want i_size_add() ?
>
> Also, there was a similar report on updating i_{ctime,mtime} to current_time()
> which means that i_size is not the only field that is causing data race.
> https://syzkaller.appspot.com/bug?id=067d40ab9ab23a6fa0a8156857ed54e295062a29
>
> Hmm, where is the serialization that avoids concurrent
> shmem_mknod()/shmem_mknod() or shmem_mknod()/shmem_unlink() ?
> i_size_write() says "need locking around it (normally i_mutex)"...
>
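
Editor's sketch (not part of the thread): the reply above makes two separate points, that a marked access such as READ_ONCE() is preferable to a bare data_race() annotation, and that code comparing a racily read size is exposed to time-of-check/time-of-use (TOCTOU) problems. The user-space C below illustrates only the second point, under stated assumptions: all names are hypothetical, and a C11 relaxed atomic load stands in for READ_ONCE(). The size is sampled exactly once, so the bounds check and the arithmetic after it see the same value even if another thread shrinks the file in between.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for an inode whose size may change concurrently. */
struct file_sketch {
	_Atomic long long i_size;
};

/* Stand-in for READ_ONCE(inode->i_size): one untorn load. */
static long long size_read_once(struct file_sketch *f)
{
	return atomic_load_explicit(&f->i_size, memory_order_relaxed);
}

/*
 * Clamp a request for `len` bytes at offset `pos` to the file size.
 *
 * The racy pattern would call the size accessor twice, once in the check
 * and once in the subtraction; a concurrent truncate between the two calls
 * could then yield a negative length. Using one snapshot for both avoids
 * that, although the snapshot itself may of course be stale on return.
 */
static long long clamp_read_len(struct file_sketch *f, long long pos,
				long long len)
{
	long long size = size_read_once(f);

	assert(pos >= 0 && len >= 0);
	if (pos >= size)
		return 0;
	return len < size - pos ? len : size - pos;
}
```

A single snapshot removes the inconsistency between check and use, but it does not make the result current; callers that need the size to stay valid still have to hold whatever lock excludes the writer.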

* Re: [syzbot] [mm?] KCSAN: data-race in generic_fillattr / shmem_mknod (2)
       [not found] <0000000000007337c705fa1060e2@google.com>
       [not found] ` <CACT4Y+a=xWkNGw_iKibRp4ivSE8OJkWWT0VPQ4N4d1+vj0FMdg@mail.gmail.com>
@ 2024-01-12 12:15 ` syzbot
  1 sibling, 0 replies; 5+ messages in thread
From: syzbot @ 2024-01-12 12:15 UTC
To: akpm, dvyukov, hughd, linux-fsdevel, linux-kernel, linux-mm, penguin-kernel, syzkaller-bugs, viro

syzbot has found a reproducer for the following issue on:

HEAD commit:    70d201a40823 Merge tag 'f2fs-for-6.8-rc1' of git://git.ker..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16e391f5e80000
kernel config:  https://syzkaller.appspot.com/x/.config?x=31b069fcee8f481d
dashboard link: https://syzkaller.appspot.com/bug?extid=702361cf7e3d95758761
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=147a56a3e80000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/4446464b507c/disk-70d201a4.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/578f39e16cac/vmlinux-70d201a4.xz
kernel image: https://storage.googleapis.com/syzbot-assets/5fffd404e095/bzImage-70d201a4.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+702361cf7e3d95758761@syzkaller.appspotmail.com

==================================================================
BUG: KCSAN: data-race in generic_fillattr / shmem_mknod

write to 0xffff88810427aa10 of 8 bytes by task 3467 on cpu 1:
 inode_set_mtime_to_ts include/linux/fs.h:1571 [inline]
 shmem_mknod+0x132/0x180 mm/shmem.c:3259
 shmem_create+0x34/0x40 mm/shmem.c:3313
 lookup_open fs/namei.c:3486 [inline]
 open_last_lookups fs/namei.c:3555 [inline]
 path_openat+0xdc2/0x1d30 fs/namei.c:3785
 do_filp_open+0xf6/0x200 fs/namei.c:3815
 do_sys_openat2+0xab/0x110 fs/open.c:1404
 do_sys_open fs/open.c:1419 [inline]
 __do_sys_openat fs/open.c:1435 [inline]
 __se_sys_openat fs/open.c:1430 [inline]
 __x64_sys_openat+0xf3/0x120 fs/open.c:1430
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0x59/0x120 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x63/0x6b

read to 0xffff88810427aa10 of 8 bytes by task 3068 on cpu 0:
 inode_get_mtime include/linux/fs.h:1565 [inline]
 generic_fillattr+0x1a6/0x2f0 fs/stat.c:61
 shmem_getattr+0x17b/0x200 mm/shmem.c:1139
 vfs_getattr_nosec fs/stat.c:135 [inline]
 vfs_getattr+0x198/0x1e0 fs/stat.c:176
 vfs_statx+0x140/0x320 fs/stat.c:248
 vfs_fstatat+0xcd/0x100 fs/stat.c:304
 __do_sys_newfstatat fs/stat.c:468 [inline]
 __se_sys_newfstatat+0x58/0x260 fs/stat.c:462
 __x64_sys_newfstatat+0x55/0x60 fs/stat.c:462
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0x59/0x120 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x63/0x6b

value changed: 0x00000000366f1c62 -> 0x000000003707b2e3

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 3068 Comm: udevd Not tainted 6.7.0-syzkaller-06264-g70d201a40823 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023
==================================================================


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.