From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: virtio-fs@redhat.com, qemu-devel@nongnu.org
Subject: Re: [Virtio-fs] [PATCH] virtiofsd: Fix data corruption with O_APPEND write in writeback mode
Date: Fri, 1 Nov 2019 08:08:50 +0000
Message-ID: <20191101080850.GA2432@work-vm>
In-Reply-To: <20191031154732.GC7308@redhat.com>
* Vivek Goyal (vgoyal@redhat.com) wrote:
> On Wed, Oct 23, 2019 at 09:25:23PM +0900, Misono Tomohiro wrote:
> > When writeback mode is enabled (-o writeback), O_APPEND handling is
> > done in the kernel, so virtiofsd clears the O_APPEND flag when it
> > opens a file. Otherwise the O_APPEND flag takes precedence over the
> > offset passed to pwrite(), and the written data may be corrupted.
> >
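For readers following along, the O_APPEND/pwrite() interaction described
above can be reproduced outside virtiofsd. This is only a minimal sketch,
assuming a Linux host and a writable /tmp; the helper names
size_after_pwrite_at_zero() and check_append_overrides_offset() are made
up for illustration and are not part of the patch:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes pwrite() and mkstemp() on glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: write 4 bytes to a fresh temporary file, reopen it
 * with O_WRONLY | extra_flags, pwrite() 2 bytes at offset 0, and return
 * the resulting file size (or -1 on any error).
 */
long size_after_pwrite_at_zero(int extra_flags)
{
    char path[] = "/tmp/append-demo-XXXXXX";   /* assumes /tmp is writable */
    struct stat st;
    long size = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    if (write(fd, "AAAA", 4) != 4) {
        close(fd);
        unlink(path);
        return -1;
    }
    close(fd);

    fd = open(path, O_WRONLY | extra_flags);
    if (fd >= 0) {
        if (pwrite(fd, "BB", 2, 0) == 2 && fstat(fd, &st) == 0)
            size = (long)st.st_size;
        close(fd);
    }
    unlink(path);
    return size;
}

/* Linux-specific expectation: with O_APPEND the offset 0 is ignored. */
void check_append_overrides_offset(void)
{
    assert(size_after_pwrite_at_zero(O_APPEND) == 6);  /* "AAAABB" */
    assert(size_after_pwrite_at_zero(0) == 4);         /* "BBAA"   */
}
```

On Linux, a descriptor opened with O_APPEND appends regardless of the
offset given to pwrite(), so the file grows to 6 bytes in the first case.
That is the behaviour that would misplace data if virtiofsd kept O_APPEND
on the host fd while the guest kernel chose the write offsets.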
> > Currently the O_APPEND flag is cleared only in lo_open(), but the
> > same operation is needed in lo_create(). So, factor the flag update
> > out of lo_open() into update_open_flags() and call it from both
> > lo_open() and lo_create().
> >
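The flag rewriting that both callers now share can also be looked at in
isolation. The following is a sketch only, not the code that was merged:
normalize_open_flags() and check_normalize_open_flags() are hypothetical
names, and the function works on a plain int instead of
struct fuse_file_info so that it compiles without the FUSE headers:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* glibc only defines O_DIRECT with this */
#endif
#include <assert.h>
#include <fcntl.h>

#ifndef O_DIRECT
#define O_DIRECT 0              /* fallback assumption for non-Linux hosts */
#endif

/*
 * Hypothetical stand-alone variant of the flag update: returns the flags
 * the daemon would pass to openat() on the host.
 */
int normalize_open_flags(int writeback, int flags)
{
    /* With writeback cache the kernel may read from a write-only file. */
    if (writeback && (flags & O_ACCMODE) == O_WRONLY) {
        flags &= ~O_ACCMODE;
        flags |= O_RDWR;
    }

    /* With writeback cache the kernel picks the append offset itself. */
    if (writeback && (flags & O_APPEND))
        flags &= ~O_APPEND;

    /* Guest O_DIRECT does not imply bypassing the host page cache. */
    flags &= ~O_DIRECT;

    return flags;
}

void check_normalize_open_flags(void)
{
    /* Writeback: write-only append becomes plain read-write. */
    assert(normalize_open_flags(1, O_WRONLY | O_APPEND) == O_RDWR);
    /* No writeback: access mode and O_APPEND are left alone. */
    assert(normalize_open_flags(0, O_WRONLY | O_APPEND) ==
           (O_WRONLY | O_APPEND));
    /* O_DIRECT is dropped in either mode. */
    assert((normalize_open_flags(0, O_RDONLY | O_DIRECT) & O_DIRECT) == 0);
}
```

Keeping this in one helper means a create path and an open path cannot
drift apart again, which is what went wrong before this patch.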
> > This fixes the failure of xfstest generic/069 in writeback mode
> > (which tests O_APPEND write data integrity).
> >
> > Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
>
> Reviewed-by: Vivek Goyal <vgoyal@redhat.com>
Thanks, merged.
Dave
> Thanks
> Vivek
>
> > ---
> > contrib/virtiofsd/passthrough_ll.c | 56 +++++++++++++++---------------
> > 1 file changed, 28 insertions(+), 28 deletions(-)
> >
> > diff --git a/contrib/virtiofsd/passthrough_ll.c b/contrib/virtiofsd/passthrough_ll.c
> > index e8892c3c32..79fb78ecce 100644
> > --- a/contrib/virtiofsd/passthrough_ll.c
> > +++ b/contrib/virtiofsd/passthrough_ll.c
> > @@ -1733,6 +1733,32 @@ static void lo_releasedir(fuse_req_t req, fuse_ino_t ino, struct fuse_file_info
> >      fuse_reply_err(req, 0);
> >  }
> >
> > +static void update_open_flags(int writeback, struct fuse_file_info *fi)
> > +{
> > +    /* With writeback cache, kernel may send read requests even
> > +       when userspace opened write-only */
> > +    if (writeback && (fi->flags & O_ACCMODE) == O_WRONLY) {
> > +        fi->flags &= ~O_ACCMODE;
> > +        fi->flags |= O_RDWR;
> > +    }
> > +
> > +    /* With writeback cache, O_APPEND is handled by the kernel.
> > +       This breaks atomicity (since the file may change in the
> > +       underlying filesystem, so that the kernel's idea of the
> > +       end of the file isn't accurate anymore). In this example,
> > +       we just accept that. A more rigorous filesystem may want
> > +       to return an error here */
> > +    if (writeback && (fi->flags & O_APPEND))
> > +        fi->flags &= ~O_APPEND;
> > +
> > +    /*
> > +     * O_DIRECT in guest should not necessarily mean bypassing page
> > +     * cache on host as well. If somebody needs that behavior, it
> > +     * probably should be a configuration knob in daemon.
> > +     */
> > +    fi->flags &= ~O_DIRECT;
> > +}
> > +
> >  static void lo_create(fuse_req_t req, fuse_ino_t parent, const char *name,
> >                        mode_t mode, struct fuse_file_info *fi)
> >  {
> > @@ -1760,12 +1786,7 @@ static void lo_create(fuse_req_t req, fuse_ino_t parent, const char *name,
> >      if (err)
> >          goto out;
> >
> > -    /*
> > -     * O_DIRECT in guest should not necessarily mean bypassing page
> > -     * cache on host as well. If somebody needs that behavior, it
> > -     * probably should be a configuration knob in daemon.
> > -     */
> > -    fi->flags &= ~O_DIRECT;
> > +    update_open_flags(lo->writeback, fi);
> >
> >      fd = openat(parent_inode->fd, name,
> >                  (fi->flags | O_CREAT) & ~O_NOFOLLOW, mode);
> > @@ -1966,28 +1987,7 @@ static void lo_open(fuse_req_t req, fuse_ino_t ino, struct fuse_file_info *fi)
> >
> >      fuse_log(FUSE_LOG_DEBUG, "lo_open(ino=%" PRIu64 ", flags=%d)\n", ino, fi->flags);
> >
> > -    /* With writeback cache, kernel may send read requests even
> > -       when userspace opened write-only */
> > -    if (lo->writeback && (fi->flags & O_ACCMODE) == O_WRONLY) {
> > -        fi->flags &= ~O_ACCMODE;
> > -        fi->flags |= O_RDWR;
> > -    }
> > -
> > -    /* With writeback cache, O_APPEND is handled by the kernel.
> > -       This breaks atomicity (since the file may change in the
> > -       underlying filesystem, so that the kernel's idea of the
> > -       end of the file isn't accurate anymore). In this example,
> > -       we just accept that. A more rigorous filesystem may want
> > -       to return an error here */
> > -    if (lo->writeback && (fi->flags & O_APPEND))
> > -        fi->flags &= ~O_APPEND;
> > -
> > -    /*
> > -     * O_DIRECT in guest should not necessarily mean bypassing page
> > -     * cache on host as well. If somebody needs that behavior, it
> > -     * probably should be a configuration knob in daemon.
> > -     */
> > -    fi->flags &= ~O_DIRECT;
> > +    update_open_flags(lo->writeback, fi);
> >
> >      sprintf(buf, "%i", lo_fd(req, ino));
> >      fd = openat(lo->proc_self_fd, buf, fi->flags & ~O_NOFOLLOW);
> > --
> > 2.21.0
> >
> > _______________________________________________
> > Virtio-fs mailing list
> > Virtio-fs@redhat.com
> > https://www.redhat.com/mailman/listinfo/virtio-fs
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK