From mboxrd@z Thu Jan 1 00:00:00 1970 From: John Hubbard Date: Fri, 25 Feb 2022 13:33:33 -0800 Subject: [Cluster-devel] [PATCH -v3] ext4: don't BUG if kernel subsystems dirty pages without asking ext4 first In-Reply-To: References: Message-ID: <2f9933b3-a574-23e1-e632-72fc29e582cf@nvidia.com> List-Id: To: cluster-devel.redhat.com MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit On 2/25/22 13:23, Theodore Ts'o wrote: > [un]pin_user_pages_remote is dirtying pages without properly warning > the file system in advance. This was noted by Jan Kara in 2018[1] and In 2018, [un]pin_user_pages_remote did not exist. And so what Jan reported was actually that dio_bio_complete() was calling set_page_dirty_lock() on pages that were not (any longer) set up for that. > more recently has resulted in bug reports by Syzbot in various Android > kernels[2]. > > This is technically a bug in mm/gup.c, but arguably ext4 is fragile in Is it, really? unpin_user_pages_dirty_lock() moved the set_page_dirty_lock() call into mm/gup.c, but that merely refactored things. The callers are all over the kernel, and those callers are what need changing in order to fix this. thanks, -- John Hubbard NVIDIA > that a buggy kernel subsystem which dirty pages without properly > notifying the file system causes ext4 to BUG, while other file systems > are not (although user data likely will be lost). I suspect in real > life it is rare that people are using RDMA into file-backed memory, > which is why no one has complained to ext4 developers except fuzzing > programs. > > So instead of crashing with a BUG, issue a warning (since there may be > potential data loss) and just mark the page as clean to avoid > unprivileged denial of service attacks until the problem can be > properly fixed. More discussion and background can be found in the > thread starting at [2]. > > [1] https://lore.kernel.org/linux-mm/20180103100430.GE4911 at quack2.suse.cz > [2] https://lore.kernel.org/r/Yg0m6IjcNmfaSokM at google.com > > Reported-by: syzbot+d59332e2db681cf18f0318a06e994ebbb529a8db at syzkaller.appspotmail.com > Reported-by: Lee Jones > Signed-off-by: Theodore Ts'o > --- > fs/ext4/inode.c | 27 ++++++++++++++++++++++++++- > 1 file changed, 26 insertions(+), 1 deletion(-) > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index 01c9e4f743ba..008fe8750109 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c > @@ -1993,6 +1993,15 @@ static int ext4_writepage(struct page *page, > else > len = PAGE_SIZE; > > + /* Should never happen but for bugs in other kernel subsystems */ > + if (!page_has_buffers(page)) { > + ext4_warning_inode(inode, > + "page %lu does not have buffers attached", page->index); > + ClearPageDirty(page); > + unlock_page(page); > + return 0; > + } > + > page_bufs = page_buffers(page); > /* > * We cannot do block allocation or other extent handling in this > @@ -2588,12 +2597,28 @@ static int mpage_prepare_extent_to_map(struct mpage_da_data *mpd) > (mpd->wbc->sync_mode == WB_SYNC_NONE)) || > unlikely(page->mapping != mapping)) { > unlock_page(page); > - continue; > + goto out; > } > > wait_on_page_writeback(page); > BUG_ON(PageWriteback(page)); > > + /* > + * Should never happen but for buggy code in > + * other subsystems that call > + * set_page_dirty() without properly warning > + * the file system first. See [1] for more > + * information. > + * > + * [1] https://lore.kernel.org/linux-mm/20180103100430.GE4911 at quack2.suse.cz > + */ > + if (!page_has_buffers(page)) { > + ext4_warning_inode(mpd->inode, "page %lu does not have buffers attached", page->index); > + ClearPageDirty(page); > + unlock_page(page); > + continue; > + } > + > if (mpd->map.m_len == 0) > mpd->first_page = page->index; > mpd->next_page = page->index + 1;