From: Ric Wheeler <rwheeler@redhat.com>
To: "Ted Ts'o" <tytso@mit.edu>, Zheng Liu <gnehzuil.liu@gmail.com>,
linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-ext4@vger.kernel.org, Zheng Liu <wenqing.lz@taobao.com>
Subject: Re: [RFC][PATCH 0/3] add FALLOC_FL_NO_HIDE_STALE flag in fallocate
Date: Tue, 17 Apr 2012 14:52:15 -0400
Message-ID: <4F8DBBDF.6010803@redhat.com>
In-Reply-To: <20120417184306.GA5916@thunk.org>

On 04/17/2012 02:43 PM, Ted Ts'o wrote:
> On Tue, Apr 17, 2012 at 01:59:37PM -0400, Ric Wheeler wrote:
>> You could get both security and avoid the run time hit by fully
>> writing the file or by having a variation that relied on "discard"
>> (i.e., no need to zero data if we can discard or track it as
>> unwritten).
> It's certainly the case that if the device supports persistent
> discard, something which we definitely *should* do is to send the
> discard at fallocate time and then mark the space as initialized.
This should all be advertised under /sys/block/sda; it is definitely worth
encouraging devices to support this. I think the device mapper "thin" target
also supports discard, so you could get this behaviour on all devices if needed.
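
As a rough illustration of checking what a device advertises, here is a minimal sketch that reads the discard-related queue attributes from sysfs. The attribute names (discard_granularity, discard_max_bytes, discard_zeroes_data) are ones the block layer exports under /sys/block/<dev>/queue; the helper names and the sysfs_root parameter are hypothetical, added only so the function can be pointed at a test directory. Note that discard_zeroes_data is only a hint, and newer kernels may report 0 there regardless of the hardware.

```python
from pathlib import Path


def discard_limits(device: str, sysfs_root: str = "/sys/block") -> dict:
    """Return the discard-related queue attributes exported for a block device."""
    queue = Path(sysfs_root) / device / "queue"
    limits = {}
    for name in ("discard_granularity", "discard_max_bytes", "discard_zeroes_data"):
        attr = queue / name
        # Not every kernel exports every attribute; a missing one is reported as None.
        limits[name] = int(attr.read_text().strip()) if attr.is_file() else None
    return limits


def supports_discard(device: str, sysfs_root: str = "/sys/block") -> bool:
    """Treat a non-zero discard_max_bytes as 'the device accepts discards'."""
    return bool(discard_limits(device, sysfs_root)["discard_max_bytes"])


if __name__ == "__main__":
    # "sda" is just the example device named in this thread.
    print(discard_limits("sda"))
```

A non-zero discard_max_bytes only says the device accepts discards; whether a discarded range reliably reads back as zeros is a separate question, which is the "persistent discard" property discussed here.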
>
> Unfortunately, not all devices, and in particular no HDDs of which I
> am aware, support persistent discard. And writing all zeros to the file
> is in fact what a number of programs of which I am aware (including
> an enterprise database) are doing, precisely because they tend to
> write into the fallocated space in a somewhat random order, and the
> extent conversion cost is in fact quite significant. But writing all
> zeros to the file before you can use it is quite costly; at the very
> least it burns disk bandwidth --- one of the main motivations of
> fallocate was to avoid needing to do a "write all zero pass", and
> while it does solve the problem for some use cases (such as DVR's),
> it's not a complete solution.
We also have WRITE_SAME (with a default pattern of zero data), which has long
been used in SCSI to initialize data.
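
For concreteness, the userspace "write all zeros" pass described in the quoted text might look roughly like the sketch below. It is only an illustration under stated assumptions: preallocate_zeroed and its chunk parameter are hypothetical names, and it uses plain posix_fallocate plus pwrite rather than anything device-specific such as WRITE_SAME.

```python
import os


def preallocate_zeroed(path: str, size: int, chunk: int = 1 << 20) -> None:
    """Reserve `size` bytes for `path`, then overwrite the whole range with zeros."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        # Reserve the space up front; on ext4 the new extents typically start
        # out flagged as unwritten, which is what later random writes must convert.
        os.posix_fallocate(fd, 0, size)
        zeros = bytes(chunk)
        offset = 0
        while offset < size:
            n = min(chunk, size - offset)
            # pwrite may write fewer bytes than requested, so advance by its result.
            offset += os.pwrite(fd, zeros[:n], offset)
        # Flush so the zeroed blocks reach the device before the file is used.
        os.fsync(fd)
    finally:
        os.close(fd)
```

The cost is visible in the loop: every byte of the file goes to the device once before the application writes any real data, which is the disk bandwidth being objected to.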
>
> Whether or not it is a security issue is debatable. If using the
> fallocate flag requires CAP_SYS_RAWIO, and the process has to
> explicitly ask for the privilege, a process with those privileges can
> access memory and I/O ports directly, via the ioperm(2) and
> iopl(2) system calls. So I think it's possible to be a bit nuanced
> over whether or not this is as horrible as you might think.
We are still papering over an issue that seems to not be a challenge for XFS.
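
On the CAP_SYS_RAWIO gating mentioned in the quoted text: one way to see whether a process currently holds that capability is to decode the CapEff mask from /proc/<pid>/status. The sketch below assumes the usual "CapEff:" line with a hexadecimal mask and uses bit 17, the value of CAP_SYS_RAWIO in <linux/capability.h>; the function names are hypothetical.

```python
CAP_SYS_RAWIO = 17  # bit number from <linux/capability.h>


def effective_caps(status_text: str) -> int:
    """Extract the effective capability mask from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # The mask is printed as a hexadecimal number without a 0x prefix.
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")


def has_cap_sys_rawio(status_text: str) -> bool:
    """True if the CAP_SYS_RAWIO bit is set in the effective mask."""
    return bool((effective_caps(status_text) >> CAP_SYS_RAWIO) & 1)


if __name__ == "__main__":
    with open("/proc/self/status") as f:
        print(has_cap_sys_rawio(f.read()))
```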
>
> Ultimately, if there are application programmers who are really
> desperate for that last bit of performance, they can always use
> FIBMAP/FIEMAP and then read/write directly to the block device. (And
> no, that's not a theoretical example.) I think it is a worthwhile
> goal to provide file system interfaces that allow a trusted process
> which has the appropriate security capabilities to do things in a
> safer way than that.
>
I would prefer to let the very few crazy application programmers who need this
do insane things rather than open up and expose stale data to these applications.
Or have them use a different file system that does not have this penalty
(or not to the same degree).
Thanks!
Ric