From: Dave Chinner <david@fromorbit.com>
To: Olga Kornievskaia <olga.kornievskaia@gmail.com>
Cc: trond.myklebust@hammerspace.com,
Anna Schumaker <anna.schumaker@netapp.com>,
viro@zeniv.linux.org.uk, Steve French <smfrench@gmail.com>,
Miklos Szeredi <miklos@szeredi.hu>,
linux-nfs <linux-nfs@vger.kernel.org>,
linux-fsdevel@vger.kernel.org, linux-cifs@vger.kernel.org,
linux-unionfs@vger.kernel.org, linux-man@vger.kernel.org
Subject: Re: [PATCH v4 02/11] VFS: copy_file_range check validity of input source offset
Date: Thu, 1 Nov 2018 10:33:08 +1100
Message-ID: <20181031233308.GR6311@dastard>
In-Reply-To: <CAN-5tyEHAUadnxuGmxC_KJzwUT7JO=-VSyioAQM0GTUR6+WY_Q@mail.gmail.com>
On Wed, Oct 31, 2018 at 10:51:48AM -0400, Olga Kornievskaia wrote:
> On Tue, Oct 30, 2018 at 8:15 PM Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Tue, Oct 30, 2018 at 05:10:58PM -0400, Olga Kornievskaia wrote:
> > > On Tue, Oct 30, 2018 at 5:03 AM Dave Chinner <david@fromorbit.com> wrote:
> > > >
> > > > On Mon, Oct 29, 2018 at 10:41:22AM -0400, Olga Kornievskaia wrote:
> > > > > On Sat, Oct 27, 2018 at 5:27 AM Dave Chinner <david@fromorbit.com> wrote:
> > > > > >
> > > > > > On Fri, Oct 26, 2018 at 04:10:48PM -0400, Olga Kornievskaia wrote:
> > > > > > > From: Olga Kornievskaia <kolga@netapp.com>
> > > > > > >
> > > > > > > Input source offset can't be beyond the end of the file.
> > > > > > >
> > > > > > > Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> > > > > > > ---
> > > > > > > fs/read_write.c | 3 +++
> > > > > > > 1 file changed, 3 insertions(+)
> > > > > > >
> > > > > > > diff --git a/fs/read_write.c b/fs/read_write.c
> > > > > > > index fb4ffca..b3b304e 100644
> > > > > > > --- a/fs/read_write.c
> > > > > > > +++ b/fs/read_write.c
> > > > > > > @@ -1594,6 +1594,9 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in,
> > > > > > > }
> > > > > > > }
> > > > > > >
> > > > > > > + if (pos_in >= i_size_read(inode_in))
> > > > > > > + return -EINVAL;
> > > > > > > +
> > > > > >
> > > > > > vfs_copy_file_range seems to be missing a wide range of checks.
> > > > > > rlimit, s_maxbytes, LFS file sizes, etc. This is a write, so all the
> > > > > > checks in generic_write_checks() apply, right? And the same security
> > > > > > issues like stripping setuid bits, etc? And we need to touch
> > > > > > atime on the source file, too?
> > > > >
> > > > > Yes sound like needed checks.
> > > > >
> > > > > > We've just merged 5 or so patches in 4.19-rc8 and we're ready to
> > > > > > merge another ~30 patch series to fix all the stuff missing from the
> > > > > > clone/dedupe file range operations that make them safe and robust.
> > > > > > It seems like copy_file_range is missing all the checks it needs, too?
> > > > >
> > > > > Are you proposing to not do this check now in favor of the proper work
> > > > > that will do all of those checks you listed above?
> > > >
> > > > No, I'm saying that if you're adding one check, there's a whole heap
> > > > of checks that still need to be added, *especially* if this is going
> > > > to fall back to page cache copy between superblocks that may have
> > > > different limits and constraints.
> > > >
> > > > There are security issues in this API. They need to be fixed before we
> > > > allow it to do more and potentially expose more problems due to its
> > > > wider capability.
> > >
> > > Before I totally give up on this feature, can you help me understand
> > > your concerns with allowing the generic copy_file_range via
> > > do_splice().
> >
> > it's not do_splice_direct() i'm concerned about. It's /writing data
> > without adequate checks/ that I'm concerned about.
> > ->copy_file_range() also writes data, so it needs to undergo the
> > same safety checks as well.
>
> Thank you Dave for clarifying and elaborating on the points. As you
> pointed out, these concerns apply to the current code the same way as to
> the patch series. Those concerns should be addressed; however, I feel like
> they shouldn't be the responsibility of this particular patch series.
> Therefore, I ask for the community to either make any final comments
> for any changes that are needed to "version 7" patches and if no more
> comments arise I would like to ask for this to be added to the queue
> for the next kernel version.
>
> Then the next patch series would be just VFS and would add appropriate
> checks and then allow for the generic copy_file_range() via do_splice.

That's fine by me.

>
> > > I have mentioned I'm not a VFS expert thus I come from just looking at
> > > the available documentation and the code.
> > >
> > > I don't see any restrictions on the files being passed in the
> > > do_splice_direct(). There are no restrictions that they must be from
> > > the same filesystem or file system type. But perhaps this is not the
> > > concern you had but more about checking validity of arguments?
> > >
> > > I have looked at Darrick Wong's series; if I'm not mistaken these 2 are the
> > > relevant patches:
> > > [PATCH 02/28] vfs: check file ranges before cloning files
> > > -- a couple but not all checks apply to copy_file_range() .
> >
> > Yes, of course - clone/dedupe have different constraints, but the
> > core checks are still needed for copy_file_range().
> >
> > For example, the man page says:
> >
> > EINVAL
> > Requested range extends beyond the end of the source
> > file; or the flags argument is not 0.
> >
> > Your patch above doesn't actually check that - it only checks if the
> > pos_in is beyond EOF. It needs to check if pos_in + len is beyond
> > EOF. After checking for wraps, of course.
>
> There was a reason why I didn't include the "pos_in + len" check. It
> sparked the conversation why should "pos_in + len" be an error, when a
> "read" system call would just return a "short" read and EOF. So I
> dropped the check for "pos_in + len" to be an error.

So man page patches will be required, too. :)

Basically, we need to nail down the expected semantics, make sure
they are correctly documented and /enforced consistently/ across all
filesystems.
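
To make the two candidate semantics concrete, here is a minimal
userspace sketch of the source range check. It is not kernel code:
check_src_range() and its "strict" flag are hypothetical names used only
for illustration, and returning -EOVERFLOW for a wrapping range is an
assumption. With strict set, a range extending past EOF is an error, as
the man page text quoted above says; with strict clear, the length is
clamped so the caller gets a short copy, like a short read. A pos_in at
or beyond EOF is -EINVAL in both modes, as in the patch.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical model of the source-side range check.
 * Returns 0 and stores the usable length in *out_len, or a negative errno.
 */
int check_src_range(int64_t pos_in, uint64_t len, int64_t i_size,
		    int strict, uint64_t *out_len)
{
	if (pos_in < 0 || i_size < 0)
		return -EINVAL;

	/* Wrap check first: pos_in + len must fit in a signed 64-bit offset. */
	if (len > (uint64_t)(INT64_MAX - pos_in))
		return -EOVERFLOW;	/* errno choice is an assumption */

	/* Same condition as the patch: start at or beyond EOF. */
	if (pos_in >= i_size)
		return -EINVAL;

	if ((uint64_t)pos_in + len > (uint64_t)i_size) {
		if (strict)
			return -EINVAL;	/* man page semantics */
		len = (uint64_t)(i_size - pos_in);	/* short copy */
	}

	*out_len = len;
	return 0;
}
```

Whichever mode is chosen, the point is that the check lives in one
place, so every filesystem sees the same behaviour.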
> > > -- these checks apply to the code once we fall back to the
> > > do_splice().
> >
> > man page says:
> >
> > EFBIG
> > An attempt was made to write a file that exceeds the
> > implementation-defined maximum file size or the process's
> > file size limit, or to write at a position past the maximum
> > allowed offset.
> >
> > These conditions apply to the destination file regardless of the method
> > used to copy the data. That's what the generic methods now check for
> > clone/dedupe, and need to be used here, too.
>
> Agreed and once Darrick's patches are in, copy_file_range() can use them too.

Should be in the next couple of days.
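
For illustration only, the destination-side EFBIG conditions from the
man page text above can be modelled in userspace roughly like this.
struct dst_limits and check_dst_range() are hypothetical names, not the
helpers from Darrick's series, and treating a negative rlimit_fsize as
"unlimited" is an assumption of the sketch. A write that starts at or
past the effective limit fails with -EFBIG; one that merely crosses the
limit is shortened.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical container for the limits that apply to the destination. */
struct dst_limits {
	int64_t s_maxbytes;	/* filesystem maximum file size */
	int64_t rlimit_fsize;	/* process file size limit, < 0 = unlimited */
};

/*
 * Returns 0 (possibly shortening *len) or a negative errno.
 */
int check_dst_range(int64_t pos_out, uint64_t *len,
		    const struct dst_limits *lim)
{
	int64_t limit = lim->s_maxbytes;

	if (pos_out < 0)
		return -EINVAL;

	/* The tighter of the two limits wins. */
	if (lim->rlimit_fsize >= 0 && lim->rlimit_fsize < limit)
		limit = lim->rlimit_fsize;

	/* Starting at or past the limit: nothing can be written. */
	if (pos_out >= limit)
		return -EFBIG;

	/* Crossing the limit: shorten the copy to what fits. */
	if (*len > (uint64_t)(limit - pos_out))
		*len = (uint64_t)(limit - pos_out);

	return 0;
}
```

The same function would be called no matter which method ends up moving
the data, which is the property being asked for here.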
> > 7debbf015f58 xfs: update ctime and remove suid before cloning files
> >
> > Which then got moved into the generic remap_file_range code in
> > Darrick's "vfs: remap helper should update destination inode
> > metadata" patch:
> >
> > https://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git/commit/?h=for-next&id=8dde90bca6fca3736ea20109654bcf6dcf2ecf1d
> >
> > We can't assume that a server side copy is going to strip setuid
> > bits or even update target files c/mtimes.
>
> I would like to discuss your concerns about updating attributes
> (c/m/atimes), why shouldn't it be a ->copy_file_range()
> responsibility. copy_file_range is basically a read+write. As far as I
> can tell, vfs_read and vfs_write (in VFS) don't deal with updating
> attributes.

You're looking at the wrong level. The VFS layer is the first
multiplexing layer, allowing filesystems to select a method of
handling functionality. Filesystems then make use of "generic helpers"
to implement the required functionality, and those helpers contain the
required updates.

e.g. A list of generic helpers with atime update callers from my
cscope index:

f fs/pipe.c    pipe_read                    343 file_accessed(filp);
h fs/readdir.c iterate_dir                   56 file_accessed(file);
i fs/splice.c  generic_file_splice_read     311 file_accessed(in);
j fs/splice.c  splice_direct_to_actor       992 file_accessed(in);
p mm/filemap.c generic_file_buffered_read  2299 file_accessed(filp);
q mm/filemap.c generic_file_read_iter      2339 file_accessed(file);
r mm/filemap.c generic_file_mmap           2736 file_accessed(file);

These are effectively reference implementations of the file reading
infrastructure. Filesystems often have customised implementations
but they all must contain the same functionality and behaviour as
the reference implementation.

> I'm guessing it's assumed that underlying file systems are
> going to take care of it (unless of course I misread the code).

Only the ones that don't specifically call the generic helper to do
the work.

IOWs, what I'd like to see is a generic_copy_file_range() as the
reference implementation using a page cache copy. This contains all
the required checks, timestamp updates, etc. If the filesystem does
not supply ->copy_file_range, then generic_copy_file_range() is
called, not do_splice_direct(). Indeed, a filesystem should be able
to do:

	.copy_file_range = xfs_copy_file_range,

xfs_copy_file_range(...)
{
	trace_xfs_copy_file_range(...);
	return generic_copy_file_range(....);
}

and have everything work correctly.
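
A self-contained userspace model of that dispatch, purely as a sketch:
every name ending in _model, struct file as defined here, and the
examplefs_* names are hypothetical stand-ins, not the real kernel
structures or signatures. The "generic" function only counts calls and
reports the full length; in the real thing this is where the checks and
timestamp updates would live.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

struct file;

/* Hypothetical stand-in for the one file_operations method we care about. */
struct file_operations_model {
	ssize_t (*copy_file_range)(struct file *in, long long pos_in,
				   struct file *out, long long pos_out,
				   size_t len, unsigned int flags);
};

struct file {
	const struct file_operations_model *f_op;
	int generic_calls;	/* how often the generic helper ran */
	int fs_calls;		/* how often the fs method ran */
};

/* Reference implementation: all common checks/updates would go here. */
ssize_t generic_copy_file_range_model(struct file *in, long long pos_in,
				      struct file *out, long long pos_out,
				      size_t len, unsigned int flags)
{
	(void)in; (void)pos_in; (void)pos_out; (void)flags;
	out->generic_calls++;
	return (ssize_t)len;
}

/* Top-level entry: use the fs method if present, else the generic one. */
ssize_t vfs_copy_file_range_model(struct file *in, long long pos_in,
				  struct file *out, long long pos_out,
				  size_t len, unsigned int flags)
{
	if (flags != 0)
		return -EINVAL;
	if (out->f_op && out->f_op->copy_file_range)
		return out->f_op->copy_file_range(in, pos_in, out, pos_out,
						  len, flags);
	return generic_copy_file_range_model(in, pos_in, out, pos_out,
					     len, flags);
}

/* A filesystem that wraps the generic helper, as in the example above. */
ssize_t examplefs_copy_file_range(struct file *in, long long pos_in,
				  struct file *out, long long pos_out,
				  size_t len, unsigned int flags)
{
	out->fs_calls++;	/* stands in for the tracepoint */
	return generic_copy_file_range_model(in, pos_in, out, pos_out,
					     len, flags);
}

const struct file_operations_model examplefs_fops = {
	.copy_file_range = examplefs_copy_file_range,
};
```

Either path ends up in the same generic function, so a filesystem that
only wants a tracepoint gets identical behaviour to one that supplies
no method at all.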
Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com