From mboxrd@z Thu Jan  1 00:00:00 1970
From: Fengguang Wu
Subject: Re: [PATCH] writeback: Fix data corruption on NFS
Date: Mon, 13 Jan 2014 19:22:19 +0800
Message-ID: <20140113112219.GC22110@localhost>
References: <1386791976-2286-1-git-send-email-jack@suse.cz>
 <20131214054539.GA21310@localhost>
 <20140113105826.GA10113@quack.suse.cz>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: linux-fsdevel@vger.kernel.org, Dan Duval,
 trond.myklebust@primarydata.com, chuck.lever@oracle.com, Linus Torvalds
To: Jan Kara
Return-path:
Received: from mga14.intel.com ([143.182.124.37]:33272 "EHLO mga14.intel.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1751381AbaAMLWY (ORCPT ); Mon, 13 Jan 2014 06:22:24 -0500
Content-Disposition: inline
In-Reply-To: <20140113105826.GA10113@quack.suse.cz>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Mon, Jan 13, 2014 at 11:58:26AM +0100, Jan Kara wrote:
> On Sat 14-12-13 13:45:39, Wu Fengguang wrote:
> > On Wed, Dec 11, 2013 at 08:59:36PM +0100, Jan Kara wrote:
> > > Commit 4f8ad655dbc8 "writeback: Refactor writeback_single_inode()" added
> > > a condition to skip clean inode. However this is wrong in WB_SYNC_ALL
> > > mode because there we also want to wait for outstanding writeback on
> > > possibly clean inode. This was causing occasional data corruption issues
> > > on NFS because it uses sync_inode() to make sure all outstanding writes
> > > are flushed to the server before truncating the inode and with
> > > sync_inode() returning prematurely file was sometimes extended back
> > > by an outstanding write after it was truncated.
> > >
> > > So modify the test to also check for pages under writeback in
> > > WB_SYNC_ALL mode.
> >
> > Applied to the writeback tree. Thank you, Jan!
>   Didn't you forget to send pull request? I don't see the patch in mainline
> yet and since this is a data corruption issue, it would be nice to get it
> out for 3.13...

Sorry, I scheduled it for the next release.
However, you are right that it should have been merged earlier. I should
have spoken up about the plan so that we could notice and resolve the
difference of opinion sooner.

I'll try to send a pull request to Linus and hope he still accepts
patches.

Fengguang

> > > CC: stable@vger.kernel.org # >= 3.5
> > > Fixes: 4f8ad655dbc82cf05d2edc11e66b78a42d38bf93
> > > Reported-and-tested-by: Dan Duval
> > > Signed-off-by: Jan Kara
> > > ---
> > >  fs/fs-writeback.c | 15 +++++++++------
> > >  1 file changed, 9 insertions(+), 6 deletions(-)
> > >
> > >  Fenguang, can you please merge this patch? Thanks!
> > >
> > > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> > > index 1f4a10ece2f1..e0259a163f98 100644
> > > --- a/fs/fs-writeback.c
> > > +++ b/fs/fs-writeback.c
> > > @@ -516,13 +516,16 @@ writeback_single_inode(struct inode *inode, struct bdi_writeback *wb,
> > >  	}
> > >  	WARN_ON(inode->i_state & I_SYNC);
> > >  	/*
> > > -	 * Skip inode if it is clean. We don't want to mess with writeback
> > > -	 * lists in this function since flusher thread may be doing for example
> > > -	 * sync in parallel and if we move the inode, it could get skipped. So
> > > -	 * here we make sure inode is on some writeback list and leave it there
> > > -	 * unless we have completely cleaned the inode.
> > > +	 * Skip inode if it is clean and we have no outstanding writeback in
> > > +	 * WB_SYNC_ALL mode. We don't want to mess with writeback lists in this
> > > +	 * function since flusher thread may be doing for example sync in
> > > +	 * parallel and if we move the inode, it could get skipped. So here we
> > > +	 * make sure inode is on some writeback list and leave it there unless
> > > +	 * we have completely cleaned the inode.
> > >  	 */
> > > -	if (!(inode->i_state & I_DIRTY))
> > > +	if (!(inode->i_state & I_DIRTY) &&
> > > +	    (wbc->sync_mode != WB_SYNC_ALL ||
> > > +	     !mapping_tagged(inode->i_mapping, PAGECACHE_TAG_WRITEBACK)))
> > >  		goto out;
> > >  	inode->i_state |= I_SYNC;
> > >  	spin_unlock(&inode->i_lock);
> > > --
> > > 1.8.1.4
> --
> Jan Kara
> SUSE Labs, CR
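
For anyone reading this thread in the archive, the change to the skip test
in the quoted patch can be modelled outside the kernel as a pair of
predicates. This is only a minimal sketch, not kernel code: struct
model_inode, old_should_skip() and new_should_skip() are hypothetical names
made up for illustration, the I_DIRTY value is a stand-in, and the
pages_under_writeback field merely models what
mapping_tagged(inode->i_mapping, PAGECACHE_TAG_WRITEBACK) reports.

```c
#include <stdbool.h>

/* Stand-ins for kernel definitions; the values are illustrative only. */
enum writeback_sync_modes { WB_SYNC_NONE, WB_SYNC_ALL };
#define I_DIRTY 0x1u

/* Hypothetical reduced inode: only the two facts the test looks at. */
struct model_inode {
	unsigned int i_state;        /* I_DIRTY set when the inode is dirty */
	bool pages_under_writeback;  /* models the PAGECACHE_TAG_WRITEBACK check */
};

/* Test before the patch: any clean inode is skipped. */
static bool old_should_skip(const struct model_inode *inode)
{
	return !(inode->i_state & I_DIRTY);
}

/*
 * Test after the patch: a clean inode is skipped only if the caller is
 * not in WB_SYNC_ALL mode, or if no pages are still under writeback.
 */
static bool new_should_skip(const struct model_inode *inode,
			    enum writeback_sync_modes sync_mode)
{
	return !(inode->i_state & I_DIRTY) &&
	       (sync_mode != WB_SYNC_ALL || !inode->pages_under_writeback);
}
```

The case the commit message describes is a clean inode that still has pages
under writeback: the old test skips it unconditionally, while the new test
does not skip it in WB_SYNC_ALL mode, so the caller goes on to wait for
those pages.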