linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: NeilBrown <neilb@suse.de>
To: "Myklebust, Trond" <Trond.Myklebust@netapp.com>
Cc: NFS <linux-nfs@vger.kernel.org>
Subject: NFS regression - EIO is returned instead of ENOSPC
Date: Wed, 12 Dec 2012 09:28:13 +1100	[thread overview]
Message-ID: <20121212092813.23afbc5f@notabene.brown> (raw)

[-- Attachment #1: Type: text/plain, Size: 4889 bytes --]


Hi Trond et al,
 we seem to have a regression introduced by 

commit 7b281ee026552f10862b617a2a51acf49c829554
    NFS: fsync() must exit with an error if page writeback failed

which has found it's way (in different form into -stable releases).

The problem is that an NFSERR_NOSPC comes through as EIO.

e.g. if /mnt2/ if an nfs mounted filesystem that has no space then

strace dd if=/dev/zero conv=fsync >> /mnt2/afile count=1

reported Input/output error and the relevant parts of the strace output are:

write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512) = 512
fsync(1)                                = -1 EIO (Input/output error)
close(1)                                = -1 ENOSPC (No space left on device)

i.e. we get an EIO from fsync, then the ENOSPC comes with the close.

If don't do the fsync, then:

strace dd if=/dev/zero >> /mnt2/afile count=1


write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512) = 512
close(1)                                = -1 EIO (Input/output error)


we don't see the ENOSPC at all.

The problem is that filemap_fdatawait_range, when it sees a page with
PageError set, will return -EIO unless AS_ENOSPC is set.  NFS never sets
AS_ENOSPC so we get -EIO.

Then nfs_file_fsync() will return as soon as it sees an error from
filemap_write_and_wait_range(), which will be that EIO.

nfs_file_fsync_commit knows to prefer the error from ctx->error over any
other error, but nfs_file_fsync() doesn't.

I see two ways to "fix" this.

We could get nfs{,4}_file_fsync() to always call nfs_file_fsync_commit() and
use the error from the later in preference to the error from
filemap_write_and_wait_range().

That results in this, more correct, behaviour:

write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512) = 512
fsync(1)                                = -1 ENOSPC (No space left on device)
close(1)                                = 0

Or we could get nfs_context_set_write_error() to call mapping_set_error(),
which would result in AS_ENOSPC being set and so
filemap_write_and_wait_range() will return the "correct" error.
This results in this behaviour:

write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 512) = 512
fsync(1)                                = -1 ENOSPC (No space left on device)
close(1)                                = -1 ENOSPC (No space left on device)

which is a bit odd.  The first ENOSPC is from AS_ENOSPC.  The second is from
ctx->error.

I feel that calling mapping_set_error() is the "right" thing to do, but it
would need a bit more work to avoid the double errors.

What approach would you prefer?

The two patches I used are:

diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index 582bb88..19c06b9 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -295,12 +295,12 @@ nfs_file_fsync(struct file *file, loff_t start, loff_t end, int datasync)
 	struct inode *inode = file->f_path.dentry->d_inode;
 
 	do {
-		ret = filemap_write_and_wait_range(inode->i_mapping, start, end);
-		if (ret != 0)
-			break;
+		int ret1 = filemap_write_and_wait_range(inode->i_mapping, start, end);
 		mutex_lock(&inode->i_mutex);
 		ret = nfs_file_fsync_commit(file, start, end, datasync);
 		mutex_unlock(&inode->i_mutex);
+		if (ret == 0)
+			ret = ret1;
 		/*
 		 * If nfs_file_fsync_commit detected a server reboot, then
 		 * resend all dirty pages that might have been covered by
diff --git a/fs/nfs/nfs4file.c b/fs/nfs/nfs4file.c
index afddd66..f0d9a88 100644
--- a/fs/nfs/nfs4file.c
+++ b/fs/nfs/nfs4file.c
@@ -96,15 +96,15 @@ nfs4_file_fsync(struct file *file, loff_t start, loff_t end, int datasync)
 	struct inode *inode = file->f_path.dentry->d_inode;
 
 	do {
-		ret = filemap_write_and_wait_range(inode->i_mapping, start, end);
-		if (ret != 0)
-			break;
+		int ret1 = filemap_write_and_wait_range(inode->i_mapping, start, end);
 		mutex_lock(&inode->i_mutex);
 		ret = nfs_file_fsync_commit(file, start, end, datasync);
 		if (!ret && !datasync)
 			/* application has asked for meta-data sync */
 			ret = pnfs_layoutcommit_inode(inode, true);
 		mutex_unlock(&inode->i_mutex);
+		if (ret == 0)
+			ret = ret1;
 		/*
 		 * If nfs_file_fsync_commit detected a server reboot, then
 		 * resend all dirty pages that might have been covered by


and 



diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 9347ab7..b0f5bb7 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -140,6 +140,7 @@ static void nfs_context_set_write_error(struct nfs_open_context *ctx, int error)
 	ctx->error = error;
 	smp_wmb();
 	set_bit(NFS_CONTEXT_ERROR_WRITE, &ctx->flags);
+	mapping_set_error(ctx->dentry->d_inode->i_mapping, error);
 }
 
 static struct nfs_page *



Thanks,
NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

             reply	other threads:[~2012-12-11 22:28 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-11 22:28 NeilBrown [this message]
2012-12-11 22:53 ` NFS regression - EIO is returned instead of ENOSPC NeilBrown
2012-12-11 23:16   ` Myklebust, Trond
     [not found]   ` <1355267807.23203.16.camel@lade.trondhjem.org>
2012-12-11 23:20     ` Myklebust, Trond
2012-12-12  0:05       ` NeilBrown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121212092813.23afbc5f@notabene.brown \
    --to=neilb@suse.de \
    --cc=Trond.Myklebust@netapp.com \
    --cc=linux-nfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).