* [PATCH 6.6 6.7] nilfs2: fix hang in nilfs_lookup_dirty_data_buffers()
From: Ryusuke Konishi @ 2024-02-14 11:01 UTC
To: stable, Greg Kroah-Hartman; +Cc: Andrew Morton
commit 38296afe3c6ee07319e01bb249aa4bb47c07b534 upstream.
Syzbot reported a hang issue in migrate_pages_batch() called by mbind()
and nilfs_lookup_dirty_data_buffers() called in the log writer of nilfs2.
migrate_pages_batch() locks a folio and waits for its writeback to
complete. Meanwhile, the log writer thread that should bring that
writeback to completion picks up the same folio in
nilfs_lookup_dirty_data_buffers(), which it calls for subsequent log
creation, and tries to lock it. The result is a deadlock.
In the first place, it is unexpected that folios/pages in the middle of
writeback will be updated and become dirty. Nilfs2 adds a checksum to
verify the validity of the log being written and uses it for recovery at
mount, so data must not change during writeback. Since this guarantee is
broken, an unclean shutdown could potentially cause recovery to fail.
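To illustrate why a store that lands during writeback matters for recovery,
here is a minimal user-space sketch. It is not nilfs2 code: toy_checksum()
is an Adler-style stand-in for the log's real checksum, and
log_survives_late_store() is a hypothetical helper invented for this
example. It only shows that a checksum recorded when the log was built no
longer matches once the block is modified afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy Adler-style checksum; an assumption standing in for the real one. */
static uint32_t toy_checksum(const unsigned char *buf, size_t len)
{
	uint32_t a = 1, b = 0;

	assert(buf != NULL);
	for (size_t i = 0; i < len; i++) {
		a = (a + buf[i]) % 65521;
		b = (b + a) % 65521;
	}
	return (b << 16) | a;
}

/*
 * Hypothetical model: the checksum is recorded when the log is built,
 * then the block is (optionally) modified while "writeback" is still in
 * flight. Returns 1 if the recorded checksum still matches, 0 if not.
 */
int log_survives_late_store(int store_during_writeback)
{
	unsigned char block[16];
	uint32_t recorded;

	memset(block, 0xAB, sizeof(block));
	recorded = toy_checksum(block, sizeof(block));

	if (store_during_writeback)
		block[3] ^= 0x01;	/* a write fault was allowed through */

	return toy_checksum(block, sizeof(block)) == recorded;
}
```

In this model, log_survives_late_store(0) reports a match and
log_survives_late_store(1) reports a mismatch, which corresponds to a log
that recovery would reject.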
Investigation revealed that the root cause is that the wait for writeback
completion in nilfs_page_mkwrite() is conditional, and if the backing
device does not require stable writes, data may be modified without
waiting.
Fix these issues by making nilfs_page_mkwrite() wait for writeback to
finish regardless of the stable write requirement of the backing device.
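The difference between the old and new behavior can be sketched as a small
user-space model. This is an illustration under stated assumptions, not
kernel code: struct toy_page and both functions are hypothetical names, and
the flags merely model the writeback state and the backing device's stable
write requirement described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two facts the wait decision depends on. */
struct toy_page {
	bool under_writeback;		/* page is currently being written back */
	bool dev_requires_stable;	/* backing device requires stable writes */
};

/*
 * Models the old call, wait_for_stable_page(): the wait happens only
 * when the backing device requires stable writes.
 */
bool old_mkwrite_waits(const struct toy_page *p)
{
	assert(p != NULL);
	return p->dev_requires_stable && p->under_writeback;
}

/*
 * Models the new call, wait_on_page_writeback(): the wait happens
 * whenever writeback is in flight, regardless of the device.
 */
bool new_mkwrite_waits(const struct toy_page *p)
{
	assert(p != NULL);
	return p->under_writeback;
}
```

For a page under writeback on a device that does not require stable writes,
the old model does not wait and the new one does, which is the case this
patch closes.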
Link: https://lkml.kernel.org/r/20240131145657.4209-1-konishi.ryusuke@gmail.com
Fixes: 1d1d1a767206 ("mm: only enforce stable page writes if the backing device requires it")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Reported-by: syzbot+ee2ae68da3b22d04cd8d@syzkaller.appspotmail.com
Closes: https://lkml.kernel.org/r/00000000000047d819061004ad6c@google.com
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
Please apply this patch to the stable trees indicated by the subject line
prefix.
These versions do not yet have page-to-folio conversion applied to the
target function, so page-based "wait_on_page_writeback()" is used instead
of "folio_wait_writeback()" in this patch. This did not apply as-is to
v6.5 and earlier versions due to an fs-wide change. So I would like to
post a separate patch for earlier stable trees.
Thanks,
Ryusuke Konishi
fs/nilfs2/file.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/fs/nilfs2/file.c b/fs/nilfs2/file.c
index 740ce26d1e76..0505feef79f4 100644
--- a/fs/nilfs2/file.c
+++ b/fs/nilfs2/file.c
@@ -105,7 +105,13 @@ static vm_fault_t nilfs_page_mkwrite(struct vm_fault *vmf)
 	nilfs_transaction_commit(inode->i_sb);
 
  mapped:
-	wait_for_stable_page(page);
+	/*
+	 * Since checksumming including data blocks is performed to determine
+	 * the validity of the log to be written and used for recovery, it is
+	 * necessary to wait for writeback to finish here, regardless of the
+	 * stable write requirement of the backing device.
+	 */
+	wait_on_page_writeback(page);
  out:
 	sb_end_pagefault(inode->i_sb);
 	return vmf_fs_error(ret);
--
2.39.3
* Re: [PATCH 6.6 6.7] nilfs2: fix hang in nilfs_lookup_dirty_data_buffers()
From: Greg Kroah-Hartman @ 2024-02-19 18:29 UTC
To: Ryusuke Konishi; +Cc: stable, Andrew Morton
On Wed, Feb 14, 2024 at 08:01:10PM +0900, Ryusuke Konishi wrote:
> commit 38296afe3c6ee07319e01bb249aa4bb47c07b534 upstream.
>
> Syzbot reported a hang issue in migrate_pages_batch() called by mbind()
> and nilfs_lookup_dirty_data_buffers() called in the log writer of nilfs2.
>
> migrate_pages_batch() locks a folio and waits for its writeback to
> complete. Meanwhile, the log writer thread that should bring that
> writeback to completion picks up the same folio in
> nilfs_lookup_dirty_data_buffers(), which it calls for subsequent log
> creation, and tries to lock it. The result is a deadlock.
>
> In the first place, it is unexpected that folios/pages in the middle of
> writeback will be updated and become dirty. Nilfs2 adds a checksum to
> verify the validity of the log being written and uses it for recovery at
> mount, so data must not change during writeback. Since this guarantee is
> broken, an unclean shutdown could potentially cause recovery to fail.
>
> Investigation revealed that the root cause is that the wait for writeback
> completion in nilfs_page_mkwrite() is conditional, and if the backing
> device does not require stable writes, data may be modified without
> waiting.
>
> Fix these issues by making nilfs_page_mkwrite() wait for writeback to
> finish regardless of the stable write requirement of the backing device.
>
> Link: https://lkml.kernel.org/r/20240131145657.4209-1-konishi.ryusuke@gmail.com
> Fixes: 1d1d1a767206 ("mm: only enforce stable page writes if the backing device requires it")
> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
> Reported-by: syzbot+ee2ae68da3b22d04cd8d@syzkaller.appspotmail.com
> Closes: https://lkml.kernel.org/r/00000000000047d819061004ad6c@google.com
> Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> Please apply this patch to the stable trees indicated by the subject line
> prefix.
>
> These versions do not yet have page-to-folio conversion applied to the
> target function, so page-based "wait_on_page_writeback()" is used instead
> of "folio_wait_writeback()" in this patch. This did not apply as-is to
> v6.5 and earlier versions due to an fs-wide change. So I would like to
> post a separate patch for earlier stable trees.
All now queued up, thanks!
greg k-h