* [PATCH v2] nfsd: special case truncates some more
@ 2017-01-24 8:22 Christoph Hellwig
2017-01-24 14:06 ` Jeff Layton
2017-02-08 21:27 ` NFS server regression in v4.10-rc7 Chuck Lever
0 siblings, 2 replies; 11+ messages in thread
From: Christoph Hellwig @ 2017-01-24 8:22 UTC (permalink / raw)
To: bfields, jlayton; +Cc: trondmy, linux-nfs
Both the NFS protocols and the Linux VFS use a setattr operation with a
bitmap of attributs to set to set various file attributes including the
file size and the uid/gid.
The Linux syscalls never mixe size updates with unrelated updates like
the uid/gid, and some file systems like XFS and GFS2 rely on the fact
that truncates might not update random other attributes, and many
other file systems handle the case but do not update the different
attributes in the same transaction. NFSD on the other hand passes
the attributes it gets on the wire more or less directly through to
the VFS, leading to updates the file systems don't expect. XFS at
least has an assert on the allowed attributes, which cought an NFS
client sets the size and group at the same time.
To handles this issue properly this switches nfsd to call vfs_truncate
for size changes, and then handling all other attributes through
notify_change. As a side effect this also means less boilerplace
code around the size change as we can now reuse the VFS code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
Changes since V1:
- avoid an extra setattr just for a MTIME attribute on the wire
fs/nfsd/vfs.c | 97 +++++++++++++++++++++++------------------------------------
1 file changed, 37 insertions(+), 60 deletions(-)
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 26c6fdb..e91107a 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -332,37 +332,6 @@ nfsd_sanitize_attrs(struct inode *inode, struct iattr *iap)
}
}
-static __be32
-nfsd_get_write_access(struct svc_rqst *rqstp, struct svc_fh *fhp,
- struct iattr *iap)
-{
- struct inode *inode = d_inode(fhp->fh_dentry);
- int host_err;
-
- if (iap->ia_size < inode->i_size) {
- __be32 err;
-
- err = nfsd_permission(rqstp, fhp->fh_export, fhp->fh_dentry,
- NFSD_MAY_TRUNC | NFSD_MAY_OWNER_OVERRIDE);
- if (err)
- return err;
- }
-
- host_err = get_write_access(inode);
- if (host_err)
- goto out_nfserrno;
-
- host_err = locks_verify_truncate(inode, NULL, iap->ia_size);
- if (host_err)
- goto out_put_write_access;
- return 0;
-
-out_put_write_access:
- put_write_access(inode);
-out_nfserrno:
- return nfserrno(host_err);
-}
-
/*
* Set various file attributes. After this call fhp needs an fh_put.
*/
@@ -377,7 +346,6 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
__be32 err;
int host_err;
bool get_write_count;
- int size_change = 0;
if (iap->ia_valid & (ATTR_ATIME | ATTR_MTIME | ATTR_SIZE))
accmode |= NFSD_MAY_WRITE|NFSD_MAY_OWNER_OVERRIDE;
@@ -390,11 +358,11 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
/* Get inode */
err = fh_verify(rqstp, fhp, ftype, accmode);
if (err)
- goto out;
+ return err;
if (get_write_count) {
host_err = fh_want_write(fhp);
if (host_err)
- return nfserrno(host_err);
+ goto out_host_err;
}
dentry = fhp->fh_dentry;
@@ -405,50 +373,59 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
iap->ia_valid &= ~ATTR_MODE;
if (!iap->ia_valid)
- goto out;
+ return 0;
nfsd_sanitize_attrs(inode, iap);
+ if (check_guard && guardtime != inode->i_ctime.tv_sec)
+ return nfserr_notsync;
+
/*
* The size case is special, it changes the file in addition to the
- * attributes.
+ * attributes, and file systems don't expect it to be mixed with
+ * "random" attribute changes. We thus split out the size change
+ * into a separate calo for vfs_truncate, and do the rest as a
+ * a separate setattr call.
*/
if (iap->ia_valid & ATTR_SIZE) {
- err = nfsd_get_write_access(rqstp, fhp, iap);
- if (err)
- goto out;
- size_change = 1;
+ struct path path = {
+ .mnt = fhp->fh_export->ex_path.mnt,
+ .dentry = dentry,
+ };
+ bool implicit_mtime = false;
/*
- * RFC5661, Section 18.30.4:
- * Changing the size of a file with SETATTR indirectly
- * changes the time_modify and change attributes.
- *
- * (and similar for the older RFCs)
+ * vfs_truncate implicity updates the mtime IFF the file size
+ * actually changes. Avoid the additional seattr call below if
+ * the only other attribute that the client sends is the mtime.
*/
- if (iap->ia_size != i_size_read(inode))
- iap->ia_valid |= ATTR_MTIME;
- }
+ if (iap->ia_size != i_size_read(inode) &&
+ ((iap->ia_valid & ~(ATTR_SIZE | ATTR_MTIME)) == 0))
+ implicit_mtime = true;
- iap->ia_valid |= ATTR_CTIME;
+ host_err = vfs_truncate(&path, iap->ia_size);
+ if (host_err)
+ goto out_host_err;
- if (check_guard && guardtime != inode->i_ctime.tv_sec) {
- err = nfserr_notsync;
- goto out_put_write_access;
+ iap->ia_valid &= ~ATTR_SIZE;
+ if (implicit_mtime)
+ iap->ia_valid &= ~ATTR_MTIME;
+ if (!iap->ia_valid)
+ goto done;
}
+ iap->ia_valid |= ATTR_CTIME;
+
fh_lock(fhp);
host_err = notify_change(dentry, iap, NULL);
fh_unlock(fhp);
- err = nfserrno(host_err);
+ if (host_err)
+ goto out_host_err;
-out_put_write_access:
- if (size_change)
- put_write_access(inode);
- if (!err)
- err = nfserrno(commit_metadata(fhp));
-out:
- return err;
+done:
+ host_err = commit_metadata(fhp);
+out_host_err:
+ return nfserrno(host_err);
}
#if defined(CONFIG_NFSD_V4)
--
2.1.4
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2] nfsd: special case truncates some more
2017-01-24 8:22 [PATCH v2] nfsd: special case truncates some more Christoph Hellwig
@ 2017-01-24 14:06 ` Jeff Layton
2017-01-24 22:03 ` J. Bruce Fields
2017-02-08 21:27 ` NFS server regression in v4.10-rc7 Chuck Lever
1 sibling, 1 reply; 11+ messages in thread
From: Jeff Layton @ 2017-01-24 14:06 UTC (permalink / raw)
To: Christoph Hellwig, bfields; +Cc: trondmy, linux-nfs
On Tue, 2017-01-24 at 09:22 +0100, Christoph Hellwig wrote:
> Both the NFS protocols and the Linux VFS use a setattr operation with a
> bitmap of attributs to set to set various file attributes including the
> file size and the uid/gid.
>
> The Linux syscalls never mixe size updates with unrelated updates like
> the uid/gid, and some file systems like XFS and GFS2 rely on the fact
> that truncates might not update random other attributes, and many
> other file systems handle the case but do not update the different
> attributes in the same transaction. NFSD on the other hand passes
> the attributes it gets on the wire more or less directly through to
> the VFS, leading to updates the file systems don't expect. XFS at
> least has an assert on the allowed attributes, which cought an NFS
> client sets the size and group at the same time.
>
> To handles this issue properly this switches nfsd to call vfs_truncate
> for size changes, and then handling all other attributes through
> notify_change. As a side effect this also means less boilerplace
> code around the size change as we can now reuse the VFS code.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>
> Changes since V1:
> - avoid an extra setattr just for a MTIME attribute on the wire
>
> fs/nfsd/vfs.c | 97 +++++++++++++++++++++++------------------------------------
> 1 file changed, 37 insertions(+), 60 deletions(-)
>
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 26c6fdb..e91107a 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -332,37 +332,6 @@ nfsd_sanitize_attrs(struct inode *inode, struct iattr *iap)
> }
> }
>
> -static __be32
> -nfsd_get_write_access(struct svc_rqst *rqstp, struct svc_fh *fhp,
> - struct iattr *iap)
> -{
> - struct inode *inode = d_inode(fhp->fh_dentry);
> - int host_err;
> -
> - if (iap->ia_size < inode->i_size) {
> - __be32 err;
> -
> - err = nfsd_permission(rqstp, fhp->fh_export, fhp->fh_dentry,
> - NFSD_MAY_TRUNC | NFSD_MAY_OWNER_OVERRIDE);
> - if (err)
> - return err;
> - }
> -
> - host_err = get_write_access(inode);
> - if (host_err)
> - goto out_nfserrno;
> -
> - host_err = locks_verify_truncate(inode, NULL, iap->ia_size);
> - if (host_err)
> - goto out_put_write_access;
> - return 0;
> -
> -out_put_write_access:
> - put_write_access(inode);
> -out_nfserrno:
> - return nfserrno(host_err);
> -}
> -
> /*
> * Set various file attributes. After this call fhp needs an fh_put.
> */
> @@ -377,7 +346,6 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
> __be32 err;
> int host_err;
> bool get_write_count;
> - int size_change = 0;
>
> if (iap->ia_valid & (ATTR_ATIME | ATTR_MTIME | ATTR_SIZE))
> accmode |= NFSD_MAY_WRITE|NFSD_MAY_OWNER_OVERRIDE;
> @@ -390,11 +358,11 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
> /* Get inode */
> err = fh_verify(rqstp, fhp, ftype, accmode);
> if (err)
> - goto out;
> + return err;
> if (get_write_count) {
> host_err = fh_want_write(fhp);
> if (host_err)
> - return nfserrno(host_err);
> + goto out_host_err;
> }
>
> dentry = fhp->fh_dentry;
> @@ -405,50 +373,59 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
> iap->ia_valid &= ~ATTR_MODE;
>
> if (!iap->ia_valid)
> - goto out;
> + return 0;
>
> nfsd_sanitize_attrs(inode, iap);
>
> + if (check_guard && guardtime != inode->i_ctime.tv_sec)
> + return nfserr_notsync;
> +
> /*
> * The size case is special, it changes the file in addition to the
> - * attributes.
> + * attributes, and file systems don't expect it to be mixed with
> + * "random" attribute changes. We thus split out the size change
> + * into a separate calo for vfs_truncate, and do the rest as a
> + * a separate setattr call.
> */
> if (iap->ia_valid & ATTR_SIZE) {
> - err = nfsd_get_write_access(rqstp, fhp, iap);
> - if (err)
> - goto out;
> - size_change = 1;
> + struct path path = {
> + .mnt = fhp->fh_export->ex_path.mnt,
> + .dentry = dentry,
> + };
> + bool implicit_mtime = false;
>
> /*
> - * RFC5661, Section 18.30.4:
> - * Changing the size of a file with SETATTR indirectly
> - * changes the time_modify and change attributes.
> - *
> - * (and similar for the older RFCs)
> + * vfs_truncate implicity updates the mtime IFF the file size
> + * actually changes. Avoid the additional seattr call below if
> + * the only other attribute that the client sends is the mtime.
> */
> - if (iap->ia_size != i_size_read(inode))
> - iap->ia_valid |= ATTR_MTIME;
> - }
> + if (iap->ia_size != i_size_read(inode) &&
> + ((iap->ia_valid & ~(ATTR_SIZE | ATTR_MTIME)) == 0))
> + implicit_mtime = true;
>
> - iap->ia_valid |= ATTR_CTIME;
> + host_err = vfs_truncate(&path, iap->ia_size);
> + if (host_err)
> + goto out_host_err;
>
> - if (check_guard && guardtime != inode->i_ctime.tv_sec) {
> - err = nfserr_notsync;
> - goto out_put_write_access;
> + iap->ia_valid &= ~ATTR_SIZE;
> + if (implicit_mtime)
> + iap->ia_valid &= ~ATTR_MTIME;
> + if (!iap->ia_valid)
> + goto done;
> }
>
> + iap->ia_valid |= ATTR_CTIME;
> +
> fh_lock(fhp);
> host_err = notify_change(dentry, iap, NULL);
> fh_unlock(fhp);
> - err = nfserrno(host_err);
> + if (host_err)
> + goto out_host_err;
>
> -out_put_write_access:
> - if (size_change)
> - put_write_access(inode);
> - if (!err)
> - err = nfserrno(commit_metadata(fhp));
> -out:
> - return err;
> +done:
> + host_err = commit_metadata(fhp);
> +out_host_err:
> + return nfserrno(host_err);
> }
>
> #if defined(CONFIG_NFSD_V4)
Reviewed-by: Jeff Layton <jlayton@redhat.com>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] nfsd: special case truncates some more
2017-01-24 14:06 ` Jeff Layton
@ 2017-01-24 22:03 ` J. Bruce Fields
0 siblings, 0 replies; 11+ messages in thread
From: J. Bruce Fields @ 2017-01-24 22:03 UTC (permalink / raw)
To: Jeff Layton; +Cc: Christoph Hellwig, trondmy, linux-nfs
On Tue, Jan 24, 2017 at 09:06:04AM -0500, Jeff Layton wrote:
> On Tue, 2017-01-24 at 09:22 +0100, Christoph Hellwig wrote:
> > Both the NFS protocols and the Linux VFS use a setattr operation with a
> > bitmap of attributs to set to set various file attributes including the
> > file size and the uid/gid.
> >
> > The Linux syscalls never mixe size updates with unrelated updates like
> > the uid/gid, and some file systems like XFS and GFS2 rely on the fact
> > that truncates might not update random other attributes, and many
> > other file systems handle the case but do not update the different
> > attributes in the same transaction. NFSD on the other hand passes
> > the attributes it gets on the wire more or less directly through to
> > the VFS, leading to updates the file systems don't expect. XFS at
> > least has an assert on the allowed attributes, which cought an NFS
> > client sets the size and group at the same time.
> >
> > To handles this issue properly this switches nfsd to call vfs_truncate
> > for size changes, and then handling all other attributes through
> > notify_change. As a side effect this also means less boilerplace
> > code around the size change as we can now reuse the VFS code.
> >
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > ---
> >
> > Changes since V1:
> > - avoid an extra setattr just for a MTIME attribute on the wire
> >
> > fs/nfsd/vfs.c | 97 +++++++++++++++++++++++------------------------------------
> > 1 file changed, 37 insertions(+), 60 deletions(-)
> >
> > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> > index 26c6fdb..e91107a 100644
> > --- a/fs/nfsd/vfs.c
> > +++ b/fs/nfsd/vfs.c
> > @@ -332,37 +332,6 @@ nfsd_sanitize_attrs(struct inode *inode, struct iattr *iap)
> > }
> > }
> >
> > -static __be32
> > -nfsd_get_write_access(struct svc_rqst *rqstp, struct svc_fh *fhp,
> > - struct iattr *iap)
> > -{
> > - struct inode *inode = d_inode(fhp->fh_dentry);
> > - int host_err;
> > -
> > - if (iap->ia_size < inode->i_size) {
> > - __be32 err;
> > -
> > - err = nfsd_permission(rqstp, fhp->fh_export, fhp->fh_dentry,
> > - NFSD_MAY_TRUNC | NFSD_MAY_OWNER_OVERRIDE);
> > - if (err)
> > - return err;
> > - }
> > -
> > - host_err = get_write_access(inode);
> > - if (host_err)
> > - goto out_nfserrno;
> > -
> > - host_err = locks_verify_truncate(inode, NULL, iap->ia_size);
> > - if (host_err)
> > - goto out_put_write_access;
> > - return 0;
> > -
> > -out_put_write_access:
> > - put_write_access(inode);
> > -out_nfserrno:
> > - return nfserrno(host_err);
> > -}
> > -
> > /*
> > * Set various file attributes. After this call fhp needs an fh_put.
> > */
> > @@ -377,7 +346,6 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
> > __be32 err;
> > int host_err;
> > bool get_write_count;
> > - int size_change = 0;
> >
> > if (iap->ia_valid & (ATTR_ATIME | ATTR_MTIME | ATTR_SIZE))
> > accmode |= NFSD_MAY_WRITE|NFSD_MAY_OWNER_OVERRIDE;
> > @@ -390,11 +358,11 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
> > /* Get inode */
> > err = fh_verify(rqstp, fhp, ftype, accmode);
> > if (err)
> > - goto out;
> > + return err;
> > if (get_write_count) {
> > host_err = fh_want_write(fhp);
> > if (host_err)
> > - return nfserrno(host_err);
> > + goto out_host_err;
> > }
> >
> > dentry = fhp->fh_dentry;
> > @@ -405,50 +373,59 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *fhp, struct iattr *iap,
> > iap->ia_valid &= ~ATTR_MODE;
> >
> > if (!iap->ia_valid)
> > - goto out;
> > + return 0;
> >
> > nfsd_sanitize_attrs(inode, iap);
> >
> > + if (check_guard && guardtime != inode->i_ctime.tv_sec)
> > + return nfserr_notsync;
> > +
> > /*
> > * The size case is special, it changes the file in addition to the
> > - * attributes.
> > + * attributes, and file systems don't expect it to be mixed with
> > + * "random" attribute changes. We thus split out the size change
> > + * into a separate calo for vfs_truncate, and do the rest as a
> > + * a separate setattr call.
> > */
> > if (iap->ia_valid & ATTR_SIZE) {
> > - err = nfsd_get_write_access(rqstp, fhp, iap);
> > - if (err)
> > - goto out;
> > - size_change = 1;
> > + struct path path = {
> > + .mnt = fhp->fh_export->ex_path.mnt,
> > + .dentry = dentry,
> > + };
> > + bool implicit_mtime = false;
> >
> > /*
> > - * RFC5661, Section 18.30.4:
> > - * Changing the size of a file with SETATTR indirectly
> > - * changes the time_modify and change attributes.
> > - *
> > - * (and similar for the older RFCs)
> > + * vfs_truncate implicity updates the mtime IFF the file size
> > + * actually changes. Avoid the additional seattr call below if
> > + * the only other attribute that the client sends is the mtime.
> > */
> > - if (iap->ia_size != i_size_read(inode))
> > - iap->ia_valid |= ATTR_MTIME;
> > - }
> > + if (iap->ia_size != i_size_read(inode) &&
> > + ((iap->ia_valid & ~(ATTR_SIZE | ATTR_MTIME)) == 0))
> > + implicit_mtime = true;
> >
> > - iap->ia_valid |= ATTR_CTIME;
> > + host_err = vfs_truncate(&path, iap->ia_size);
> > + if (host_err)
> > + goto out_host_err;
> >
> > - if (check_guard && guardtime != inode->i_ctime.tv_sec) {
> > - err = nfserr_notsync;
> > - goto out_put_write_access;
> > + iap->ia_valid &= ~ATTR_SIZE;
> > + if (implicit_mtime)
> > + iap->ia_valid &= ~ATTR_MTIME;
> > + if (!iap->ia_valid)
> > + goto done;
> > }
> >
> > + iap->ia_valid |= ATTR_CTIME;
> > +
> > fh_lock(fhp);
> > host_err = notify_change(dentry, iap, NULL);
> > fh_unlock(fhp);
> > - err = nfserrno(host_err);
> > + if (host_err)
> > + goto out_host_err;
> >
> > -out_put_write_access:
> > - if (size_change)
> > - put_write_access(inode);
> > - if (!err)
> > - err = nfserrno(commit_metadata(fhp));
> > -out:
> > - return err;
> > +done:
> > + host_err = commit_metadata(fhp);
> > +out_host_err:
> > + return nfserrno(host_err);
> > }
> >
> > #if defined(CONFIG_NFSD_V4)
>
> Reviewed-by: Jeff Layton <jlayton@redhat.com>
OK, applying.--b.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] nfsd: special case truncates some more
[not found] <d6fd33d2-6323-435c-b085-27a7180c4368@email.android.com>
@ 2017-01-24 22:15 ` Bruce Fields
2017-01-25 8:18 ` Christoph Hellwig
0 siblings, 1 reply; 11+ messages in thread
From: Bruce Fields @ 2017-01-24 22:15 UTC (permalink / raw)
To: Mkrtchyan, Tigran; +Cc: Christoph Hellwig, trondmy, linux-nfs, jlayton
On Tue, Jan 24, 2017 at 11:11:49PM +0100, Mkrtchyan, Tigran wrote:
> <br>
> /*
> <br>
> * The size case is special, it changes the file in addition to the
> <br>
> - * attributes.
> <br>
> + * attributes, and file systems don't expect it to be mixed with
> <br>
> + * "random" attribute changes. We thus split out the size change
> <br>
> + * into a separate calo for vfs_truncate, and do the rest </p></blockquote></div></div></div><div dir="auto"><br></div><div dir="auto">I guess by **calo** you mean **call**.</div><div dir="auto"><br></div><div dir="auto">Tigran.</div><div dir="auto"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><p dir="ltr">as a
Thanks, fixed.
Could you fix your mailer to do plain text, please?
--b.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] nfsd: special case truncates some more
2017-01-24 22:15 ` [PATCH v2] nfsd: special case truncates some more Bruce Fields
@ 2017-01-25 8:18 ` Christoph Hellwig
2017-01-25 19:25 ` Bruce Fields
0 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2017-01-25 8:18 UTC (permalink / raw)
To: Bruce Fields
Cc: Mkrtchyan, Tigran, Christoph Hellwig, trondmy, linux-nfs, jlayton
Btw, any rason this didn't go into your for-4.10 tree? Due to the
owner/group changes being the most affected attrs I would classify
it as a security issue.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2] nfsd: special case truncates some more
2017-01-25 8:18 ` Christoph Hellwig
@ 2017-01-25 19:25 ` Bruce Fields
0 siblings, 0 replies; 11+ messages in thread
From: Bruce Fields @ 2017-01-25 19:25 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Mkrtchyan, Tigran, trondmy, linux-nfs, jlayton
On Wed, Jan 25, 2017 at 09:18:04AM +0100, Christoph Hellwig wrote:
> Btw, any rason this didn't go into your for-4.10 tree? Due to the
> owner/group changes being the most affected attrs I would classify it
> as a security issue.
It only seems to affect a very new setup, and the patch wasn't totally
trivial, so I was inclined to wait and see, but... to be honest I don't
have strong feelings about it, and I respect your opinion--I can queue
it up for-4.10 wth a cc to stable.
(Not attempting to add a Fixes: line as this probably goes very far
back.)
--b.
^ permalink raw reply [flat|nested] 11+ messages in thread
* NFS server regression in v4.10-rc7
2017-01-24 8:22 [PATCH v2] nfsd: special case truncates some more Christoph Hellwig
2017-01-24 14:06 ` Jeff Layton
@ 2017-02-08 21:27 ` Chuck Lever
2017-02-08 21:39 ` Christoph Hellwig
1 sibling, 1 reply; 11+ messages in thread
From: Chuck Lever @ 2017-02-08 21:27 UTC (permalink / raw)
To: Linux NFS Mailing List; +Cc: Christoph Hellwig
> Begin forwarded message:
>=20
> From: Christoph Hellwig <hch@lst.de>
> Subject: [PATCH v2] nfsd: special case truncates some more
> Date: January 24, 2017 at 3:22:41 AM EST
> To: bfields@fieldses.org, jlayton@poochiereds.net
> Cc: trondmy@primarydata.com, linux-nfs@vger.kernel.org
>=20
> Both the NFS protocols and the Linux VFS use a setattr operation with =
a
> bitmap of attributs to set to set various file attributes including =
the
> file size and the uid/gid.
>=20
> The Linux syscalls never mixe size updates with unrelated updates like
> the uid/gid, and some file systems like XFS and GFS2 rely on the fact
> that truncates might not update random other attributes, and many
> other file systems handle the case but do not update the different
> attributes in the same transaction. NFSD on the other hand passes
> the attributes it gets on the wire more or less directly through to
> the VFS, leading to updates the file systems don't expect. XFS at
> least has an assert on the allowed attributes, which cought an NFS
> client sets the size and group at the same time.
>=20
> To handles this issue properly this switches nfsd to call vfs_truncate
> for size changes, and then handling all other attributes through
> notify_change. As a side effect this also means less boilerplace
> code around the size change as we can now reuse the VFS code.
>=20
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>=20
> Changes since V1:
> - avoid an extra setattr just for a MTIME attribute on the wire
>=20
> fs/nfsd/vfs.c | 97 =
+++++++++++++++++++++++------------------------------------
> 1 file changed, 37 insertions(+), 60 deletions(-)
>=20
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 26c6fdb..e91107a 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -332,37 +332,6 @@ nfsd_sanitize_attrs(struct inode *inode, struct =
iattr *iap)
> }
> }
>=20
> -static __be32
> -nfsd_get_write_access(struct svc_rqst *rqstp, struct svc_fh *fhp,
> - struct iattr *iap)
> -{
> - struct inode *inode =3D d_inode(fhp->fh_dentry);
> - int host_err;
> -
> - if (iap->ia_size < inode->i_size) {
> - __be32 err;
> -
> - err =3D nfsd_permission(rqstp, fhp->fh_export, =
fhp->fh_dentry,
> - NFSD_MAY_TRUNC | =
NFSD_MAY_OWNER_OVERRIDE);
> - if (err)
> - return err;
> - }
> -
> - host_err =3D get_write_access(inode);
> - if (host_err)
> - goto out_nfserrno;
> -
> - host_err =3D locks_verify_truncate(inode, NULL, iap->ia_size);
> - if (host_err)
> - goto out_put_write_access;
> - return 0;
> -
> -out_put_write_access:
> - put_write_access(inode);
> -out_nfserrno:
> - return nfserrno(host_err);
> -}
> -
> /*
> * Set various file attributes. After this call fhp needs an fh_put.
> */
> @@ -377,7 +346,6 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh =
*fhp, struct iattr *iap,
> __be32 err;
> int host_err;
> bool get_write_count;
> - int size_change =3D 0;
>=20
> if (iap->ia_valid & (ATTR_ATIME | ATTR_MTIME | ATTR_SIZE))
> accmode |=3D NFSD_MAY_WRITE|NFSD_MAY_OWNER_OVERRIDE;
> @@ -390,11 +358,11 @@ nfsd_setattr(struct svc_rqst *rqstp, struct =
svc_fh *fhp, struct iattr *iap,
> /* Get inode */
> err =3D fh_verify(rqstp, fhp, ftype, accmode);
> if (err)
> - goto out;
> + return err;
> if (get_write_count) {
> host_err =3D fh_want_write(fhp);
> if (host_err)
> - return nfserrno(host_err);
> + goto out_host_err;
> }
>=20
> dentry =3D fhp->fh_dentry;
> @@ -405,50 +373,59 @@ nfsd_setattr(struct svc_rqst *rqstp, struct =
svc_fh *fhp, struct iattr *iap,
> iap->ia_valid &=3D ~ATTR_MODE;
>=20
> if (!iap->ia_valid)
> - goto out;
> + return 0;
>=20
> nfsd_sanitize_attrs(inode, iap);
>=20
> + if (check_guard && guardtime !=3D inode->i_ctime.tv_sec)
> + return nfserr_notsync;
> +
> /*
> * The size case is special, it changes the file in addition to =
the
> - * attributes.
> + * attributes, and file systems don't expect it to be mixed with
> + * "random" attribute changes. We thus split out the size =
change
> + * into a separate calo for vfs_truncate, and do the rest as a
> + * a separate setattr call.
> */
> if (iap->ia_valid & ATTR_SIZE) {
> - err =3D nfsd_get_write_access(rqstp, fhp, iap);
> - if (err)
> - goto out;
> - size_change =3D 1;
> + struct path path =3D {
> + .mnt =3D fhp->fh_export->ex_path.mnt,
> + .dentry =3D dentry,
> + };
> + bool implicit_mtime =3D false;
>=20
> /*
> - * RFC5661, Section 18.30.4:
> - * Changing the size of a file with SETATTR indirectly
> - * changes the time_modify and change attributes.
> - *
> - * (and similar for the older RFCs)
> + * vfs_truncate implicity updates the mtime IFF the file =
size
> + * actually changes. Avoid the additional seattr call =
below if
> + * the only other attribute that the client sends is the =
mtime.
> */
> - if (iap->ia_size !=3D i_size_read(inode))
> - iap->ia_valid |=3D ATTR_MTIME;
> - }
> + if (iap->ia_size !=3D i_size_read(inode) &&
> + ((iap->ia_valid & ~(ATTR_SIZE | ATTR_MTIME)) =3D=3D =
0))
> + implicit_mtime =3D true;
>=20
> - iap->ia_valid |=3D ATTR_CTIME;
> + host_err =3D vfs_truncate(&path, iap->ia_size);
> + if (host_err)
> + goto out_host_err;
>=20
> - if (check_guard && guardtime !=3D inode->i_ctime.tv_sec) {
> - err =3D nfserr_notsync;
> - goto out_put_write_access;
> + iap->ia_valid &=3D ~ATTR_SIZE;
> + if (implicit_mtime)
> + iap->ia_valid &=3D ~ATTR_MTIME;
> + if (!iap->ia_valid)
> + goto done;
> }
>=20
> + iap->ia_valid |=3D ATTR_CTIME;
> +
> fh_lock(fhp);
> host_err =3D notify_change(dentry, iap, NULL);
> fh_unlock(fhp);
> - err =3D nfserrno(host_err);
> + if (host_err)
> + goto out_host_err;
>=20
> -out_put_write_access:
> - if (size_change)
> - put_write_access(inode);
> - if (!err)
> - err =3D nfserrno(commit_metadata(fhp));
> -out:
> - return err;
> +done:
> + host_err =3D commit_metadata(fhp);
> +out_host_err:
> + return nfserrno(host_err);
> }
>=20
> #if defined(CONFIG_NFSD_V4)
> --=20
> 2.1.4
After installing v4.10-rc7 on my test NFS server, I'm seeing a failure
with the git regression test suite.
*** t1050-large.sh ***
ok 1 - setup
ok 2 - add a large file or two
ok 3 - checkout a large file
not ok 4 - packsize limit
# =20
# test_create_repo mid &&
# (
# cd mid &&
# git config core.bigfilethreshold 64k &&
# git config pack.packsizelimit 256k &&
# =20
# # mid1 and mid2 will fit within 256k limit but
# # appending mid3 will bust the limit and will
# # result in a separate packfile.
# test-genrandom "a" $(( 66 * 1024 )) >mid1 &&
# test-genrandom "b" $(( 80 * 1024 )) >mid2 &&
# test-genrandom "c" $(( 128 * 1024 )) >mid3 &&
# git add mid1 mid2 mid3 &&
# =20
# count=3D0
# for pi in .git/objects/pack/pack-*.idx
# do
# test -f "$pi" && count=3D$(( $count + 1 ))
# done &&
# test $count =3D 2 &&
# =20
# (
# git hash-object --stdin <mid1
# git hash-object --stdin <mid2
# git hash-object --stdin <mid3
# ) |
# sort >expect &&
# =20
# for pi in .git/objects/pack/pack-*.idx
# do
# git show-index <"$pi"
# done |
# sed -e "s/^[0-9]* \([0-9a-f]*\) .*/\1/" |
# sort >actual &&
# =20
# test_cmp expect actual
# )
# =20
ok 5 - diff --raw
ok 6 - diff --stat
ok 7 - diff
Confirmed with NFSv3 / TCP and NFSv4.0 / TCP. Failure is 100%
reproducible.
If I revert the above commit on the server, the git regression test
suite passes.
The test is run like this: "cd /mnt/server/git; make clean; make; make =
test"
Can anyone else reproduce this failure?
--
Chuck Lever
chucklever@gmail.com
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: NFS server regression in v4.10-rc7
2017-02-08 21:27 ` NFS server regression in v4.10-rc7 Chuck Lever
@ 2017-02-08 21:39 ` Christoph Hellwig
2017-02-08 21:54 ` Chuck Lever
0 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2017-02-08 21:39 UTC (permalink / raw)
To: Chuck Lever; +Cc: Linux NFS Mailing List, Christoph Hellwig
On Wed, Feb 08, 2017 at 04:27:30PM -0500, Chuck Lever wrote:
> The test is run like this: "cd /mnt/server/git; make clean; make; make test"
>
> Can anyone else reproduce this failure?
What local file system do you use on the server?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: NFS server regression in v4.10-rc7
2017-02-08 21:39 ` Christoph Hellwig
@ 2017-02-08 21:54 ` Chuck Lever
2017-02-09 13:21 ` Christoph Hellwig
0 siblings, 1 reply; 11+ messages in thread
From: Chuck Lever @ 2017-02-08 21:54 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Linux NFS Mailing List
> On Feb 8, 2017, at 4:39 PM, Christoph Hellwig <hch@infradead.org> =
wrote:
>=20
> On Wed, Feb 08, 2017 at 04:27:30PM -0500, Chuck Lever wrote:
>> The test is run like this: "cd /mnt/server/git; make clean; make; =
make test"
>>=20
>> Can anyone else reproduce this failure?
>=20
> What local file system do you use on the server?
I see this with my tmpfs export.
--
Chuck Lever
chucklever@gmail.com
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: NFS server regression in v4.10-rc7
2017-02-08 21:54 ` Chuck Lever
@ 2017-02-09 13:21 ` Christoph Hellwig
2017-02-09 13:43 ` Christoph Hellwig
0 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2017-02-09 13:21 UTC (permalink / raw)
To: Chuck Lever; +Cc: Christoph Hellwig, Linux NFS Mailing List
On Wed, Feb 08, 2017 at 04:54:02PM -0500, Chuck Lever wrote:
> I see this with my tmpfs export.
The tests passes fine for me with latests mainline on either tmpfs
or xfs. This is on a 64-bit x86 VM.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: NFS server regression in v4.10-rc7
2017-02-09 13:21 ` Christoph Hellwig
@ 2017-02-09 13:43 ` Christoph Hellwig
0 siblings, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2017-02-09 13:43 UTC (permalink / raw)
To: Chuck Lever; +Cc: Christoph Hellwig, Linux NFS Mailing List
On Thu, Feb 09, 2017 at 05:21:36AM -0800, Christoph Hellwig wrote:
> On Wed, Feb 08, 2017 at 04:54:02PM -0500, Chuck Lever wrote:
> > I see this with my tmpfs export.
>
> The tests passes fine for me with latests mainline on either tmpfs
> or xfs. This is on a 64-bit x86 VM.
But that was running as root, I've now reproduced it running as a
regular user.
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2017-02-09 14:08 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-01-24 8:22 [PATCH v2] nfsd: special case truncates some more Christoph Hellwig
2017-01-24 14:06 ` Jeff Layton
2017-01-24 22:03 ` J. Bruce Fields
2017-02-08 21:27 ` NFS server regression in v4.10-rc7 Chuck Lever
2017-02-08 21:39 ` Christoph Hellwig
2017-02-08 21:54 ` Chuck Lever
2017-02-09 13:21 ` Christoph Hellwig
2017-02-09 13:43 ` Christoph Hellwig
[not found] <d6fd33d2-6323-435c-b085-27a7180c4368@email.android.com>
2017-01-24 22:15 ` [PATCH v2] nfsd: special case truncates some more Bruce Fields
2017-01-25 8:18 ` Christoph Hellwig
2017-01-25 19:25 ` Bruce Fields
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).