From: Dave Chinner <david@fromorbit.com>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Dave Jones <davej@codemonkey.org.uk>,
"Darrick J. Wong" <darrick.wong@oracle.com>,
Linux Kernel <linux-kernel@vger.kernel.org>,
linux-xfs@vger.kernel.org
Subject: Re: iov_iter_pipe warning.
Date: Tue, 12 Sep 2017 16:02:14 +1000 [thread overview]
Message-ID: <20170912060214.GS17782@dastard> (raw)
In-Reply-To: <20170911200713.GK5426@ZenIV.linux.org.uk>
On Mon, Sep 11, 2017 at 09:07:13PM +0100, Al Viro wrote:
> On Mon, Sep 11, 2017 at 04:44:40PM +1000, Dave Chinner wrote:
>
> > > iov_iter_get_pages() for pipe-backed destination does page allocation
> > > and inserts freshly allocated pages into pipe.
> >
> > Oh, it's hidden more layers down than the code implied I needed to
> > look.
> >
> > i.e. there's no obvious clue in the function names that there is
> > allocation happening in these paths (get_pipe_pages ->
> > __get_pipe_pages -> push_pipe -> page allocation). The function
> > names imply it's getting a reference to pages (like
> > (get_user_pages()) and the fact it does allocation is inconsistent
> > with it's naming. Worse, when push_pipe() fails to allocate pages,
> > the error __get_pipe_pages() returns is -EFAULT, which further hides
> > the fact push_pipe() does memory allocation that can fail....
> >
> > And then there's the companion interface that implies page
> > allocation: pipe_get_pages_alloc(). Which brings me back to there
> > being no obvious clue while reading the code from the top down that
> > pages are being allocated in push_pipe()....
> >
> > Comments and documentation for this code would help, but I can't
> > find any of that, either. Hence I assumed naming followed familiar
> > patterns and so mistook these interfaces being one that does page
> > allocation and the other for getting references to pre-existing
> > pages.....
>
> _NONE_ of those is a public interface - they are all static, to start
> with.
It still requires comments to explain *why* the code is doing what
it's doing. You wrote the code, the whole explaination of it is in
your head. I can't see any of that, so when I read the code I sit
there thinking "why the fuck is it doing this?" because there's no
explanations of the WTF? moments in the code...
> The whole damn point is to have normal ->read_iter() work for read-to-pipe
> without changes. That's why -EFAULT as error (rather than some other
> mechanism for saying that pipe is full), etc.
... like this one.
That needs a *fucking big comment* because it's not at all obvious
why ENOMEM conditions are being hidden with EFAULT.
Comments and documentation are not for the person who writes the
code - they are for the stupid morons like me that need all the
help they can get to understand complex code that does tricksy,
subtle, non-obvious shit to work correctly.
> Filesystem should *not* be changed to use that. At all. As far as it is
> concerned,
> copy_to_iter()
> copy_page_to_iter()
> iov_iter_get_pages()
> iov_iter_get_pages_alloc()
> iov_iter_advance()
> are black boxes.
The implementation may be a black box, but the operations the black
box is performing for the callers still needs to be explained.
> Note that one of the bugs there applies to normal read() as well - if you
> are reading from a hole in file into an array with a read-only page in
> the middle, you want a short read.
And there's another WTF? moment.....
How do we get a read only page in the middle of an array of pages
we've been told to write data into? And why isn't that a bug in the
code that supplied us with those pages?
> Ignoring return value from iov_iter_zero()
> is wrong for iovec-backed case as well as for pipes.
>
> Another one manages to work for iovec-backed case, albeit with rather odd
> resulting semantics. readv(2) is underspecified (to put it politely) enough
> for compliance, but it's still bloody strange. Namely, if you have a contiguous
> 50Mb chunk of file on disk and run into e.g. a failure to fault the destination
> pages in halfway through that extent, you act as if *nothing* in the damn thing
> had been read, nevermind that 25Mb had been actually already read and that had
> there been a discontinuity 5Mb prior, the first 20Mb would've been reported
> read just fine.
>
> Strictly speaking that behaviour doesn't violate POSIX.
This is direct IO. POSIX compliant behaviour went out the window
long ago..... :/
> It is, however,
> atrocious from the QoI standpoint, and for no good reason whatsoever.
> It's quite easy to do better, and doing so would've eliminated the problems
> in pipe-backed case as well (see below). In addition to that, I would
> consider teaching bio_iov_iter_get_pages() to take the maximal bio size
> as an explict argument. That would've killed the need of copying the
> iterator and calling iov_iter_advance() in iomap_dio_actor() at all.
> Anyway, the minimal candidate fix follows; it won't do anything about
> the WARN_ON() in there, seeing that those are deliberate "program is
> doing something bogus" things, but it should eliminate all crap with
> ->splice_read() misreporting the amount of data it has copied.
I'll run your updated patch through my testing, but seeing as I have
nothing that tests splice+direct IO I'm not going to be able to test
that right now. I have slightly more important things to that need
urgent attention than writing splice+DIO test cases....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2017-09-12 6:03 UTC|newest]
Thread overview: 73+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-03-21 20:59 iov_iter_pipe warning Dave Jones
2017-04-05 22:02 ` Dave Jones
2017-04-10 19:28 ` Al Viro
2017-04-10 19:42 ` Dave Jones
2017-04-10 19:57 ` Al Viro
2017-04-10 23:48 ` Dave Jones
2017-04-11 0:22 ` Al Viro
2017-04-11 3:05 ` Dave Jones
2017-04-11 3:28 ` Al Viro
2017-04-11 20:53 ` Dave Jones
2017-04-11 21:12 ` Al Viro
2017-04-11 22:25 ` Dave Jones
2017-04-11 23:28 ` Al Viro
2017-04-11 23:34 ` Dave Jones
2017-04-11 23:48 ` Al Viro
2017-04-11 23:45 ` Dave Jones
2017-04-11 23:51 ` Al Viro
2017-04-11 23:56 ` Al Viro
2017-04-12 0:06 ` Dave Jones
2017-04-12 0:17 ` Al Viro
2017-04-12 0:58 ` Dave Jones
2017-04-12 1:15 ` Al Viro
2017-04-12 2:29 ` Dave Jones
2017-04-12 2:58 ` Al Viro
2017-04-12 14:35 ` Dave Jones
2017-04-12 15:26 ` Al Viro
2017-04-12 16:27 ` Dave Jones
2017-04-12 17:07 ` Al Viro
2017-04-12 19:03 ` Dave Jones
2017-04-21 17:54 ` Al Viro
2017-04-27 4:19 ` Dave Jones
2017-04-27 16:34 ` Dave Jones
2017-04-27 17:39 ` Al Viro
2017-04-28 15:29 ` Dave Jones
2017-04-28 16:43 ` Al Viro
2017-04-28 16:50 ` Dave Jones
2017-04-28 17:20 ` Al Viro
2017-04-28 18:25 ` Al Viro
2017-04-29 1:58 ` Dave Jones
2017-04-29 2:47 ` Al Viro
2017-04-29 15:51 ` Dave Jones
2017-04-29 20:46 ` [git pull] vfs.git fix (Re: iov_iter_pipe warning.) Al Viro
2017-08-07 20:18 ` iov_iter_pipe warning Dave Jones
2017-08-28 20:31 ` Dave Jones
2017-08-29 4:25 ` Darrick J. Wong
2017-08-30 17:05 ` Dave Jones
2017-08-30 17:13 ` Darrick J. Wong
2017-08-30 17:17 ` Dave Jones
2017-09-06 20:03 ` Dave Jones
2017-09-06 23:46 ` Dave Chinner
2017-09-07 3:48 ` Dave Jones
2017-09-07 4:33 ` Al Viro
2017-09-08 1:04 ` Al Viro
2017-09-10 1:07 ` Dave Jones
2017-09-10 2:57 ` Al Viro
2017-09-10 16:07 ` Dave Jones
2017-09-10 20:05 ` Al Viro
2017-09-10 20:07 ` Dave Jones
2017-09-10 20:33 ` Al Viro
2017-09-10 21:11 ` Dave Chinner
2017-09-10 21:19 ` Al Viro
2017-09-10 22:08 ` Dave Chinner
2017-09-10 23:07 ` Al Viro
2017-09-10 23:15 ` Al Viro
2017-09-11 0:31 ` Dave Chinner
2017-09-11 3:32 ` Al Viro
2017-09-11 6:44 ` Dave Chinner
2017-09-11 20:07 ` Al Viro
2017-09-11 20:17 ` Al Viro
2017-09-12 6:02 ` Dave Chinner [this message]
2017-09-12 11:13 ` Al Viro
2017-09-11 12:07 ` Christoph Hellwig
2017-09-11 12:51 ` Al Viro
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170912060214.GS17782@dastard \
--to=david@fromorbit.com \
--cc=darrick.wong@oracle.com \
--cc=davej@codemonkey.org.uk \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=viro@ZenIV.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.