From: Al Viro <viro@ZenIV.linux.org.uk>
To: Tuomas Tynkkynen <tuomas@tuxera.com>
Cc: linux-fsdevel@vger.kernel.org,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: Weird writev() behaviour on EFAULT - also successfully modifying the file
Date: Thu, 11 Aug 2016 23:10:53 +0100 [thread overview]
Message-ID: <20160811221053.GT2356@ZenIV.linux.org.uk> (raw)
In-Reply-To: <20160811214651.74515950@duuni>
On Thu, Aug 11, 2016 at 09:46:51PM +0300, Tuomas Tynkkynen wrote:
> Greetings,
>
> I've noticed a corner case with writev() both modifying the file and
> returning -EFAULT at the same time. This happens on filesystems using
> generic_perform_write() (i.e. ext4, vfat) on 4.6.3 kernel and below,
> down to 3.16. Here's the reproducer:
[fill 8Kb with 'A', then reopen the file and do writev() on two-element
iovec array consisting of {NULL,0},{NULL,4096}]
> Running that prints "writev(): Bad address" but also some NUL bytes have
> appeared at the beginning file, in addition to the 'A's by the dd.
>
> 00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
> *
> 00001000 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 |AAAAAAAAAAAAAAAA|
> *
> 00002000
>
> Is that intented behaviour?
There are two issues mixed here: one is that iovec array (NULL/0, NULL/4K)
we get
int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes)
{
if (!(i->type & (ITER_BVEC|ITER_KVEC))) {
char __user *buf = i->iov->iov_base + i->iov_offset;
bytes = min(bytes, i->iov->iov_len - i->iov_offset);
return fault_in_pages_readable(buf, bytes);
}
return 0;
}
succeed. It shouldn't. However, even if it correctly failed, we still have
another one - __copy_from_user_inatomic() zeroes what it hadn't copied over,
so if we e.g. gave write(2) a buffer with unmapped page in the middle, we
would get a short write *and* a bunch of bytes past the amount reported
written replaced with zeroes. And _that_ goes very far back. 2.1.late
definitely had already been that way.
Note, BTW, that fixing iov_iter_fault_in_readable() won't exclude a possibility
of write() returning -EFAULT and overwriting some bytes with zeroes; all you
need is munmap() racing with write() and hitting after the successful fault-in.
We could, in principle, provide a non-zeroing variants of copy_from_user and
copy_from_user_inatomic and switch ->write()/->write_iter() instances to
use of those; however, I'm not sure if it's really worth bothering with.
I do see one potential argument in favour of doing that; suppose we have
a buffer filled with 'A'. It starts one byte before a page boundary and all
its pages are currently in swap. There's a file also filled with 'A' and
mmapped by another process. We say write() from our buffer to that file.
Having that other process observe transient zeroes appearing in its mmapped
area is Not Nice(tm); having them stick around is even worse, and that can
happen in case if the first process gets killed while it's faulting the next
page of buffer in.
Linus?
next prev parent reply other threads:[~2016-08-11 22:10 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-08-11 18:46 Weird writev() behaviour on EFAULT - also successfully modifying the file Tuomas Tynkkynen
2016-08-11 22:10 ` Al Viro [this message]
2016-08-11 22:39 ` Linus Torvalds
2016-08-11 23:03 ` Al Viro
2016-08-11 23:10 ` Linus Torvalds
2016-08-11 23:13 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160811221053.GT2356@ZenIV.linux.org.uk \
--to=viro@zeniv.linux.org.uk \
--cc=linux-fsdevel@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=tuomas@tuxera.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).