* [PATCH stable 6.6 and 6.7] NFS: Fix data corruption caused by congestion.
@ 2024-02-27 23:23 NeilBrown
From: NeilBrown @ 2024-02-27 23:23 UTC
To: stable, Trond Myklebust, Anna Schumaker; +Cc: linux-nfs

When AOP_WRITEPAGE_ACTIVATE is returned (as NFS does when it detects
congestion) it is important that the folio is redirtied.
nfs_writepage_locked() doesn't do this, so files can become corrupted as
writes can be lost.

Note that this is not needed in v6.8 as AOP_WRITEPAGE_ACTIVATE cannot be
returned. It is needed for kernels v5.18..v6.7. Prior to 6.3 the patch
is different as it needs to mention "page", not "folio".

Reported-and-tested-by: Jacek Tomaka <Jacek.Tomaka@poczta.fm>
Fixes: 6df25e58532b ("nfs: remove reliance on bdi congestion")
Signed-off-by: NeilBrown <neilb@suse.de>
---
fs/nfs/write.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index b664caea8b4e..9e345d3c305a 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -668,8 +668,10 @@ static int nfs_writepage_locked(struct folio *folio,
 	int err;
 
 	if (wbc->sync_mode == WB_SYNC_NONE &&
-	    NFS_SERVER(inode)->write_congested)
+	    NFS_SERVER(inode)->write_congested) {
+		folio_redirty_for_writepage(wbc, folio);
 		return AOP_WRITEPAGE_ACTIVATE;
+	}
 
 	nfs_inc_stats(inode, NFSIOS_VFSWRITEPAGE);
 	nfs_pageio_init_write(&pgio, inode, 0, false,
--
2.43.0
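
The failure mode described in the patch can be illustrated with a small
userspace model. This is only a sketch, not kernel code: every toy_* name
below is hypothetical, and it assumes (as the writeback paths generally do)
that the caller clears the dirty state before invoking ->writepage, so a
->writepage that bails out on congestion without redirtying leaves data that
is neither dirty nor written.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for kernel concepts; none of these are real kernel types. */
enum toy_result { TOY_WRITTEN = 0, TOY_ACTIVATE = 1 };

struct toy_folio {
	bool dirty;      /* modified in memory, still scheduled for writeback */
	bool on_server;  /* contents have reached the server */
};

/*
 * Toy ->writepage. The caller is assumed to have cleared the dirty flag
 * already. On congestion it gives up, and only redirties the folio when
 * redirty_on_congestion is set (the behaviour the patch adds).
 */
static enum toy_result toy_writepage(struct toy_folio *f, bool congested,
				     bool redirty_on_congestion)
{
	assert(!f->dirty);
	if (congested) {
		if (redirty_on_congestion)
			f->dirty = true;
		return TOY_ACTIVATE;
	}
	f->on_server = true;
	return TOY_WRITTEN;
}

/* One writeback attempt: clear the dirty flag, then call toy_writepage(). */
static enum toy_result toy_writeback_once(struct toy_folio *f, bool congested,
					  bool redirty_on_congestion)
{
	if (!f->dirty)
		return TOY_WRITTEN; /* nothing to write */
	f->dirty = false;
	return toy_writepage(f, congested, redirty_on_congestion);
}

/* Data is lost when it is neither queued for writeback nor on the server. */
static bool toy_data_lost(const struct toy_folio *f)
{
	return !f->dirty && !f->on_server;
}
```

Calling toy_writeback_once() on a dirty folio with congested set and
redirty_on_congestion clear leaves toy_data_lost() true; with
redirty_on_congestion set, the folio stays dirty and a later uncongested
attempt writes it out.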

* Re: [PATCH stable 6.6 and 6.7] NFS: Fix data corruption caused by congestion.
@ 2024-03-06 13:42 Jeff Layton
From: Jeff Layton @ 2024-03-06 13:42 UTC
To: NeilBrown, stable, Trond Myklebust, Anna Schumaker; +Cc: linux-nfs, Dan Aloni

On Wed, 2024-02-28 at 10:23 +1100, NeilBrown wrote:
> When AOP_WRITEPAGE_ACTIVATE is returned (as NFS does when it detects
> congestion) it is important that the folio is redirtied.
> nfs_writepage_locked() doesn't do this, so files can become corrupted as
> writes can be lost.
>
> Note that this is not needed in v6.8 as AOP_WRITEPAGE_ACTIVATE cannot be
> returned. It is needed for kernels v5.18..v6.7. Prior to 6.3 the patch
> is different as it needs to mention "page", not "folio".
>

Neil, I have a question about the above statement. In Linus's tree as of
this morning (v6.8-rc7-ish), it does this in nfs_writepage_locked:

	if (wbc->sync_mode == WB_SYNC_NONE &&
	    NFS_SERVER(inode)->write_congested)
		return AOP_WRITEPAGE_ACTIVATE;

The only caller of nfs_writepage_locked, and I don't see where it
redirties the page. Why don't we need this in v6.8?

-- 
Jeff Layton <jlayton@kernel.org>

* Re: [PATCH stable 6.6 and 6.7] NFS: Fix data corruption caused by congestion.
@ 2024-03-06 17:12 Jeff Layton
From: Jeff Layton @ 2024-03-06 17:12 UTC
To: NeilBrown, stable, Trond Myklebust, Anna Schumaker; +Cc: linux-nfs, Dan Aloni

On Wed, 2024-03-06 at 08:42 -0500, Jeff Layton wrote:
> On Wed, 2024-02-28 at 10:23 +1100, NeilBrown wrote:
> > Note that this is not needed in v6.8 as AOP_WRITEPAGE_ACTIVATE cannot be
> > returned. It is needed for kernels v5.18..v6.7. Prior to 6.3 the patch
> > is different as it needs to mention "page", not "folio".
> >
>
> Neil, I have a question about the above statement. In Linus's tree as of
> this morning (v6.8-rc7-ish), it does this in nfs_writepage_locked:
>
> 	if (wbc->sync_mode == WB_SYNC_NONE &&
> 	    NFS_SERVER(inode)->write_congested)
> 		return AOP_WRITEPAGE_ACTIVATE;
>

Sorry, I meant to say:

The only caller of nfs_writepage_locked is nfs_wb_folio, and I don't see
where it redirties the folio. Why don't we need this in v6.8?

-- 
Jeff Layton <jlayton@kernel.org>

* Re: [PATCH stable 6.6 and 6.7] NFS: Fix data corruption caused by congestion.
@ 2024-03-07 11:41 NeilBrown
From: NeilBrown @ 2024-03-07 11:41 UTC
To: Jeff Layton; +Cc: stable, Trond Myklebust, Anna Schumaker, linux-nfs, Dan Aloni

On Thu, 07 Mar 2024, Jeff Layton wrote:
> On Wed, 2024-02-28 at 10:23 +1100, NeilBrown wrote:
> > Note that this is not needed in v6.8 as AOP_WRITEPAGE_ACTIVATE cannot be
> > returned. It is needed for kernels v5.18..v6.7. Prior to 6.3 the patch
> > is different as it needs to mention "page", not "folio".
> >
>
> Neil, I have a question about the above statement. In Linus's tree as of
> this morning (v6.8-rc7-ish), it does this in nfs_writepage_locked:
>
> 	if (wbc->sync_mode == WB_SYNC_NONE &&
> 	    NFS_SERVER(inode)->write_congested)
> 		return AOP_WRITEPAGE_ACTIVATE;
>
> The only caller of nfs_writepage_locked, and I don't see where it
> redirties the page. Why don't we need this in v6.8?

You are right - it doesn't redirty anything. But there is no bug
here....
I didn't see it at first either, but the only caller of
nfs_writepage_locked() is nfs_wb_folio() (as you say) and that always
passes a wbc with .sync_mode = WB_SYNC_ALL. So sync_mode is never
WB_SYNC_NONE and the code snippet you included above is dead code. I've
already posted a patch to Trond and Anna to remove that code.

Thanks for the review!

NeilBrown
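
The dead-code argument in the reply above can be sketched in plain C. This is
a simplified model under stated assumptions, not the kernel source: all
model_* names are hypothetical, and the only property taken from the thread
is that the sole caller always passes a synchronous (WB_SYNC_ALL-style)
control structure, so the congestion early-return can never be reached
through it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for WB_SYNC_NONE / WB_SYNC_ALL. */
enum model_sync_mode { MODEL_SYNC_NONE, MODEL_SYNC_ALL };

/* Hypothetical return codes: 0 for "proceeded", 1 for the ACTIVATE bail-out. */
enum { MODEL_OK = 0, MODEL_ACTIVATE = 1 };

struct model_wbc {
	enum model_sync_mode sync_mode;
};

/* Models the congestion check at the top of the writepage helper. */
static int model_writepage_locked(const struct model_wbc *wbc, bool congested)
{
	if (wbc->sync_mode == MODEL_SYNC_NONE && congested)
		return MODEL_ACTIVATE; /* reachable only for async writeback */
	return MODEL_OK;
}

/* Models the sole caller, which always requests synchronous writeback. */
static int model_wb_folio(bool congested)
{
	struct model_wbc wbc = { .sync_mode = MODEL_SYNC_ALL };

	return model_writepage_locked(&wbc, congested);
}
```

Through model_wb_folio() the helper returns MODEL_OK whether or not the
server is congested; the early return only fires when the helper is called
directly with MODEL_SYNC_NONE, which in this model no caller does.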

* Re: [PATCH stable 6.6 and 6.7] NFS: Fix data corruption caused by congestion.
@ 2024-03-07 12:30 Jeff Layton
From: Jeff Layton @ 2024-03-07 12:30 UTC
To: NeilBrown; +Cc: stable, Trond Myklebust, Anna Schumaker, linux-nfs, Dan Aloni

On Thu, 2024-03-07 at 22:41 +1100, NeilBrown wrote:
> You are right - it doesn't redirty anything. But there is no bug
> here....
> I didn't see it at first either, but the only caller of
> nfs_writepage_locked() is nfs_wb_folio() (as you say) and that always
> passes a wbc with .sync_mode = WB_SYNC_ALL. So sync_mode is never
> WB_SYNC_NONE and the code snippet you included above is dead code. I've
> already posted a patch to Trond and Anna to remove that code.
>
> Thanks for the review!
>

Thanks Neil, I missed that bit about the sync_mode. I sent a R-b for
your other patch too.

Cheers,
Jeff

-- 
Jeff Layton <jlayton@kernel.org>