From: Theodore Ts'o <tytso@mit.edu>
To: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-fsdevel@vger.kernel.org, Jens Axboe <axboe@kernel.dk>,
linux-ext4@vger.kernel.org
Subject: Re: [PATCH RFC] block: use discard if possible in blkdev_issue_discard()
Date: Tue, 18 Feb 2014 00:47:07 -0500
Message-ID: <20140218054707.GI26580@thunk.org>
In-Reply-To: <yq1ob25nky8.fsf@sermon.lab.mkp.net>
On Mon, Feb 17, 2014 at 10:44:47PM -0500, Martin K. Petersen wrote:
>
> Well, that has changed a bit with the logical block provisioning bits in
> SCSI. That's why I brought up the allocation/deallocation assumptions in
> the existing two blkdev_issue_foo() calls.
Is it a fair assumption that the reason why T10 added these bits is
mainly so that clients of thin-provisioned storage devices can
guarantee that a subsequent write won't fail? Since historically the
spec writers have washed their hands of anything that might vaguely
smell of performance considerations....
> Yeah. So deprovision with guaranteed zero on read is what you're
> after. I'll chew on that a bit tomorrow.
Yes. And also some way for the host OS (or some other underlying
storage device, more generally) to send hints to the guest OS about
the best way to tune filesystems at mkfs and/or mount time for best
performance, so we don't have to require the system administrator to
manually set mount options, mkfs options, and/or magic "echo"
commands to /proc or /sys files.
It would be great if we could get SATA and SCSI devices to also
deliver these hints to the kernel, or to have our kernels apply some
heuristics based on various SCSI mode pages, and then deliver the
results to the file system or via some defined /sys files so that
userspace programs like mkfs can automatically DTRT. I'm not sure if
this is going to require spec changes and hardware changes, or whether
there are some existing hints from the device drivers that we might be
able to leverage.
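
As a rough illustration of what a userspace tool such as mkfs could
already consume, here is a minimal Python sketch that reads a few
per-device queue attributes from sysfs. The attribute names are the
ones listed in the kernel's sysfs block documentation; the helper
names (read_queue_hints, supports_discard) and the sysfs_root
parameter are made up for this example, and which attributes are
actually present depends on the kernel and the driver.

```python
from pathlib import Path

# Queue attributes exposed under /sys/block/<dev>/queue/ on Linux.
# Not every kernel or driver provides all of them.
QUEUE_ATTRS = (
    "discard_granularity",
    "discard_max_bytes",
    "minimum_io_size",
    "optimal_io_size",
    "rotational",
)


def read_queue_hints(dev, sysfs_root="/sys/block"):
    """Return {attribute: int or None} for block device `dev` (e.g. "sda").

    `sysfs_root` is a hypothetical parameter added so the function can be
    pointed at a fake directory tree for testing.
    """
    queue = Path(sysfs_root) / dev / "queue"
    hints = {}
    for attr in QUEUE_ATTRS:
        try:
            hints[attr] = int((queue / attr).read_text().strip())
        except (OSError, ValueError):
            # Attribute missing, unreadable, or not a plain integer.
            hints[attr] = None
    return hints


def supports_discard(hints):
    """Treat a missing or zero discard_max_bytes as "no discard support"."""
    return bool(hints.get("discard_max_bytes"))
```

These values only say whether discard is available and at what
granularity. They say nothing about whether issuing discards is cheap
on a given device, which is the kind of hint this message is asking for.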
For example, right now I'm just manually using the discard mount
options on certain PCIe-attached flash where I know it's beneficial,
but it's a manual tuning based on knowledge of the underlying storage
device. Figuring out when it's better to use fstrim, or to discard at
unlink time, etc., is something that's better done automatically
instead of manually, but I fear that answering questions like this
in a reliable fashion is going to be a very hard problem --- and as
storage devices get more complex, such as hybrid drives with even more
varied and interesting performance characteristics, it's only going to
get harder!
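
To make the shape of that decision concrete, here is a toy Python
sketch. Everything in it is hypothetical: the names DiscardStrategy and
choose_discard_strategy are invented, and the input
online_discard_is_cheap stands for exactly the piece of knowledge that
nothing supplies automatically today. Falling back to periodic fstrim
when that is unknown is an assumption of the sketch, not a
recommendation from this thread.

```python
from enum import Enum
from typing import Optional


class DiscardStrategy(Enum):
    NONE = "no discard"
    ONLINE = "mount -o discard"
    PERIODIC = "periodic fstrim"


def choose_discard_strategy(
    supports_discard: bool,
    online_discard_is_cheap: Optional[bool],
) -> DiscardStrategy:
    """Pick a discard strategy from two inputs.

    `online_discard_is_cheap` is a hypothetical hint: True, False, or
    None when nothing is known about the device.
    """
    if not supports_discard:
        return DiscardStrategy.NONE
    if online_discard_is_cheap:
        return DiscardStrategy.ONLINE
    # Unknown or expensive: batch the discards (assumed fallback).
    return DiscardStrategy.PERIODIC
```

The function itself is trivial. The hard part is producing a
trustworthy value for online_discard_is_cheap without a human who knows
the device.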
- Ted
Thread overview: 13+ messages
2014-02-14 4:32 [PATCH RFC] block: use discard if possible in blkdev_issue_discard() Theodore Ts'o
2014-02-14 13:05 ` Christoph Hellwig
2014-02-14 14:57 ` Theodore Ts'o
2014-02-14 17:14 ` Martin K. Petersen
2014-02-15 1:29 ` Theodore Ts'o
2014-02-17 16:44 ` Martin K. Petersen
2014-02-17 19:19 ` Theodore Ts'o
2014-02-18 1:31 ` Martin K. Petersen
2014-02-18 2:17 ` Theodore Ts'o
2014-02-18 3:44 ` Martin K. Petersen
2014-02-18 5:47 ` Theodore Ts'o [this message]
2014-02-19 2:20 ` Martin K. Petersen
2014-02-17 16:41 ` Lukáš Czerner