From: Al Viro <viro@zeniv.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: syzbot <syzbot+7a8ba368b47fdefca61e@syzkaller.appspotmail.com>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
syzkaller-bugs <syzkaller-bugs@googlegroups.com>
Subject: Re: KASAN: use-after-free Read in path_lookupat
Date: Mon, 25 Mar 2019 23:37:32 +0000 [thread overview]
Message-ID: <20190325233731.GS2217@ZenIV.linux.org.uk> (raw)
In-Reply-To: <CAHk-=wj+-e=X9cvvNwc5+QuvBOyYT7OtZTBzy-Wg8zGrUwxSbw@mail.gmail.com>
On Mon, Mar 25, 2019 at 02:45:00PM -0700, Linus Torvalds wrote:
> On Mon, Mar 25, 2019 at 2:14 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > Maybe, but we really need to come up with sane documentation on the
> > entire drop_inode/evict_inode/destroy_inode/rcu_destroy_inode
> > group ;-/
>
> Yeah.
>
> I actually think the "destroy_inode/rcu_destroy_inode" part is the
> simplest one to understand: it's just tearing down the inode, and the
> RCU part is (obviously, as shown by this thread) somewhat subtle, but
> at the same time not really all that hard to explain: "inodes that can
> be looked up through RCU lookup need to be destroyed with RCU delay".
... with at least "but remember that call_rcu()-scheduled stuff runs in
softirq, so there's a limit to what you can defer there; furthermore,
there are implications for any spinlocks taken in that callback".
> I think drop_inode->evict_inode is arguably the much more subtle
> stuff. That's the "we're not destroying things, we're making decisions
> about the state" kind of thing.
It's worse than that - another bloody subtle question is the location
of clear_inode() wrt the other work done in ->evict_inode() ;-/
> And we actually don't have any documentation at all for those two.
> Well, there's the "porting" thing, but..
>
> > And I want to understand the writeback-related issues
> > in ocfs2 and f2fs - the current kludges in those smell fishy.
>
> I'm assuming you're talking about exactly that drop_inode() kind of subtlety.
Yes.
> At least for ocfs2, the whole "rcu_destroy_inode()" patch I sent out
> would simplify things a lot. *All* that the ocfs2_destroy_inode()
> function does is to schedule the inode freeing for RCU delay.
>
> Assuming my patch is fine (as mentioned, it was entirely untested),
> that patch would actually simplify that a lot. Get rid of
> ocfs2_destroy_inode() entirely, and just make
> ocfs2_rcu_destroy_inode() do that kmem_cache_free(). Mucho clean-up
> (we have that ocfs2_rcu_destroy_inode() currently as
> ocfs2_i_callback(), but with the ugly rcu-head interface).
Sure, but I wonder if cleaner solution (if any) for whatever is going
on in their drop_inode/evict_inode area might spill into destroy_inode...
I certainly agree that having destroy_inode consist just of call_rcu()
is very common. FWIW, non-trivial exceptions from that pattern are:
fs/afs/super.c:673:static void afs_destroy_inode(struct inode *inode)
fs/btrfs/inode.c:9215:void btrfs_destroy_inode(struct inode *inode)
fs/btrfs/inode.c:9202:void btrfs_test_destroy_inode(struct inode *inode)
fs/ceph/inode.c:530:void ceph_destroy_inode(struct inode *inode)
fs/ecryptfs/super.c:88:static void ecryptfs_destroy_inode(struct inode *inode)
fs/ext4/super.c:1106:static void ext4_destroy_inode(struct inode *inode)
fs/inode.c:228:void free_inode_nonrcu(struct inode *inode)
fs/fuse/inode.c:116:static void fuse_destroy_inode(struct inode *inode)
fs/hugetlbfs/inode.c:1052:static void hugetlbfs_destroy_inode(struct inode *inode)
fs/jfs/super.c:134:static void jfs_destroy_inode(struct inode *inode)
fs/ntfs/inode.c:341:void ntfs_destroy_big_inode(struct inode *inode)
fs/overlayfs/super.c:200:static void ovl_destroy_inode(struct inode *inode)
mm/shmem.c:3646:static void shmem_destroy_inode(struct inode *inode)
net/socket.c:266:static void sock_destroy_inode(struct inode *inode)
fs/ubifs/super.c:282:static void ubifs_destroy_inode(struct inode *inode)
fs/xfs/xfs_super.c:969:xfs_fs_destroy_inode(
Some of those could convert to that pattern, so it's even fewer than
that. And getting rid of container_of in those callbacks would be
nice too. One obvious observation, though - we want the actual
fixes to go before any method changes, for backporting reasons.
For debugfs it's clearly "use default ->evict_inode(), have explicit
->destroy_inode() using free_inode_nonrcu()" - there we have nothing
else done in ->evict_inode() and kfree is obviously safe in softirq.
I'll post that (or push to vfs.git#fixes), along with minimal fixes
for other 3. If bpf_any_put() is softirq-safe, we'll have the full
set for -stable and the rest could be done on top of that.
Won't solve the documetation problem, unfortunately ;-/
next prev parent reply other threads:[~2019-03-25 23:37 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-11-28 17:40 KASAN: use-after-free Read in path_lookupat syzbot
2019-03-25 0:44 ` syzbot
2019-03-25 1:25 ` Linus Torvalds
2019-03-25 1:23 ` Linus Torvalds
2019-03-25 4:57 ` Al Viro
2019-03-25 9:15 ` Daniel Borkmann
2019-03-25 11:11 ` Al Viro
2019-03-25 11:17 ` Al Viro
2019-03-25 11:21 ` Daniel Borkmann
2019-03-25 18:36 ` Linus Torvalds
2019-03-25 19:18 ` Linus Torvalds
2019-03-25 21:14 ` Al Viro
2019-03-25 21:45 ` Linus Torvalds
2019-03-25 22:04 ` Daniel Borkmann
2019-03-25 22:13 ` Linus Torvalds
2019-03-25 22:41 ` Daniel Borkmann
2019-03-25 22:49 ` Al Viro
2019-03-25 23:37 ` Al Viro [this message]
2019-03-25 23:44 ` Alexei Starovoitov
2019-03-26 0:21 ` Al Viro
2019-03-26 1:38 ` ceph: fix use-after-free on symlink traversal Al Viro
2019-03-26 1:39 ` jffs2: " Al Viro
2019-03-26 1:40 ` ubifs: " Al Viro
2019-03-26 1:43 ` debugfs: " Al Viro
2019-03-26 10:41 ` ceph: " Jeff Layton
2019-03-26 11:38 ` Ilya Dryomov
2019-03-26 1:45 ` KASAN: use-after-free Read in path_lookupat Al Viro
2019-04-10 18:11 ` Al Viro
2019-04-10 19:44 ` Linus Torvalds
2019-03-25 19:43 ` Al Viro
2019-03-25 22:48 ` Dave Chinner
2019-03-25 23:02 ` Al Viro
[not found] ` <CAGe7X7mb=gK7zhSwmT_6mmmkcbjhZAOb=wj31BdUcHkNUPsm2Q@mail.gmail.com>
2019-03-26 4:15 ` Al Viro
2019-03-27 16:58 ` Jan Kara
2019-03-27 18:59 ` Al Viro
2019-03-28 9:00 ` Jan Kara
2019-03-27 17:22 ` Jan Kara
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190325233731.GS2217@ZenIV.linux.org.uk \
--to=viro@zeniv.linux.org.uk \
--cc=ast@kernel.org \
--cc=daniel@iogearbox.net \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=syzbot+7a8ba368b47fdefca61e@syzkaller.appspotmail.com \
--cc=syzkaller-bugs@googlegroups.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.