From: Jens Axboe <axboe@fb.com>
To: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Jeff Moyer <jmoyer@redhat.com>, <linux-fsdevel@vger.kernel.org>,
<linux-block@vger.kernel.org>, <calvinowens@fb.com>, <hch@lst.de>,
<adilger@dilger.ca>
Subject: Re: [PATCH 0/11] Update version of write stream ID patchset
Date: Tue, 8 Mar 2016 14:56:31 -0700
Message-ID: <56DF4A8F.8070103@fb.com>
In-Reply-To: <yq1r3fo4pob.fsf@sermon.lab.mkp.net>
On 03/05/2016 01:48 PM, Martin K. Petersen wrote:
>>>>>> "Jens" == Jens Axboe <axboe@fb.com> writes:
>
> Jens,
>
>>> OK. I'm still of the opinion that we should try to make this
>>> transparent. I could be swayed by workload descriptions and numbers
>>> comparing approaches, though.
>
> Jens> You can't just wave that flag and not have a solution. Any
> Jens> solution in that space would imply having policy in the kernel. A
> Jens> "just use a stream per file" is never going to work.
>
> I totally understand the desire to have explicit, long-lived
> "from-file-open to file-close" streams for things like database journals
> and whatnot.
That is an appealing use case.
> However, I think that you are dismissing the benefits of being able to
> group I/Os to disjoint LBA ranges within a brief period of time as
> belonging to a single file. It's something that we know works well on
> other types of storage. And it's also a much better heuristic for data
> placement on SSDs than just picking the next available bucket. It does
> require some pipelining on the drive but they will need some front end
> logic to handle the proposed stream ID separation in any case.
I'm not a huge fan of heuristics based exclusively on temporal and
spatial locality. Using that as a hint for the case where no stream ID
(or write tag) is given would be an improvement, though. And perhaps
parts of the stream ID space should be reserved for just that.
But I don't think that should exclude doing this in a much more managed
fashion. Personally, I find that a lot saner than adding this sort of
state tracking in the kernel.
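As a rough illustration of that split (explicit IDs win, and a locality
guess is only a fallback drawn from a reserved range), here is a small
sketch. It is Python for brevity and is not code from the patchset: the
names, the ID ranges, and the window sizes are all assumptions made up
for the example.

```python
# Illustrative only: none of these names or constants come from the patchset.
from dataclasses import dataclass, field

NO_STREAM = 0                     # assumed: 0 means "no stream ID given"
EXPLICIT_MAX = 11                 # assumed: IDs 1..11 are handed to applications
HEURISTIC_IDS = (12, 13, 14, 15)  # assumed: IDs reserved for locality guesses
WINDOW_SECTORS = 2048             # assumed spatial window
WINDOW_SECONDS = 0.5              # assumed temporal window


@dataclass
class LocalityGuesser:
    """Tracks, per reserved ID, where and when the last write ended."""
    recent: dict = field(default_factory=dict)  # sid -> (end LBA, timestamp)
    next_slot: int = 0

    def guess(self, lba, nr_sectors, now):
        # Reuse a reserved ID if this write is close to its last write
        # in both space and time.
        for sid, (last_end, last_time) in self.recent.items():
            close_in_space = abs(lba - last_end) <= WINDOW_SECTORS
            close_in_time = (now - last_time) <= WINDOW_SECONDS
            if close_in_space and close_in_time:
                self.recent[sid] = (lba + nr_sectors, now)
                return sid
        # Otherwise take the next reserved ID, round-robin.
        sid = HEURISTIC_IDS[self.next_slot % len(HEURISTIC_IDS)]
        self.next_slot += 1
        self.recent[sid] = (lba + nr_sectors, now)
        return sid


def pick_stream(explicit_id, lba, nr_sectors, now, guesser):
    """An explicit stream ID always wins; the heuristic is only a fallback."""
    if explicit_id != NO_STREAM:
        if not 1 <= explicit_id <= EXPLICIT_MAX:
            raise ValueError("explicit stream ID outside the application range")
        return explicit_id
    return guesser.guess(lba, nr_sectors, now)
```

The point of the split is that the guessing state stays confined to the
reserved range, so an application that does set a stream ID never has it
overridden by the heuristic.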
> Also, in our experiments we essentially got the explicit stream ID for
> free by virtue of the journal being written often enough that it was
> rarely if ever evicted as an active stream by the device. With no
> changes whatsoever to any application.
Journal would be an easy one to guess, for sure.
> My gripe with the current stuff is the same as before: The protocol is
> squarely aimed at papering over issues with current flash technology. It
> kinda-sorta works for other types of devices but it is very limiting. I
> appreciate that it is a great fit for the "handful of apps sharing a
> COTS NVMe drive on a cloud server" use case. But I think it is horrible
> for NVMe over Fabrics and pretty much everything else. That wouldn't be
> a big deal if the traditional storage models were going away. But I
> don't think they are...
I don't think erase blocks are going to go away in the near future.
We're going to have better media as well, that's a given, but cheaper
TLC flash is just going to make the current problem much worse. The
patchset is really about tagging the writes with a stream ID, nothing
else. That could potentially be any type of hinting; it's not exclusive
to NVMe write directives at all.
--
Jens Axboe
Thread overview: 36+ messages
2016-03-04 16:10 [PATCH 0/11] Update version of write stream ID patchset Jens Axboe
2016-03-04 16:10 ` [PATCH 01/11] idr: make ida_simple_remove() return an error Jens Axboe
2016-03-04 16:10 ` [PATCH 02/11] block: add support for carrying a stream ID in a bio Jens Axboe
2016-03-04 16:10 ` [PATCH 03/11] Add support for per-file/inode stream ID Jens Axboe
[not found] ` <CAJVOszBXU-qQENcOGG8pWeARwoWL2G3gNJ0H2uNPjXkiVa8S+Q@mail.gmail.com>
2016-03-04 20:35 ` Jens Axboe
2016-03-04 16:10 ` [PATCH 04/11] Add system call for setting inode/file write " Jens Axboe
2016-03-04 16:10 ` [PATCH 05/11] wire up system call for x86/x86-64 Jens Axboe
2016-03-04 16:10 ` [PATCH 06/11] Add support for bdi tracking of stream ID Jens Axboe
2016-03-04 16:10 ` [PATCH 07/11] direct-io: add support for write stream IDs Jens Axboe
2016-03-04 16:10 ` [PATCH 08/11] Add stream ID support for buffered mpage/__block_write_full_page() Jens Axboe
2016-03-04 16:10 ` [PATCH 09/11] btrfs: add support for write stream IDs Jens Axboe
2016-03-04 20:44 ` Chris Mason
2016-03-04 20:45 ` Jens Axboe
2016-03-04 16:10 ` [PATCH 10/11] xfs: add support for buffered writeback stream ID Jens Axboe
2016-03-04 16:10 ` [PATCH 11/11] ext4: add support for write stream IDs Jens Axboe
2016-03-04 19:42 ` [PATCH 0/11] Update version of write stream ID patchset Jeff Moyer
2016-03-04 20:34 ` Jens Axboe
2016-03-04 21:01 ` Jeff Moyer
2016-03-04 21:06 ` Jens Axboe
2016-03-04 22:03 ` Jeff Moyer
2016-03-04 22:13 ` Jens Axboe
2016-03-05 20:48 ` Martin K. Petersen
2016-03-08 21:56 ` Jens Axboe [this message]
2016-03-17 23:43 ` Dan Williams
2016-03-18 0:18 ` Jens Axboe
2016-03-18 2:39 ` Martin K. Petersen
2016-03-18 17:37 ` Jens Axboe
2016-03-18 17:56 ` Dan Williams
2016-03-06 6:13 ` Andreas Dilger
2016-03-06 13:03 ` Martin K. Petersen
2016-03-06 16:08 ` Boaz Harrosh
2016-03-06 20:51 ` Shaun Tancheff
2016-03-07 15:41 ` Martin K. Petersen
2016-03-07 15:34 ` Martin K. Petersen
2016-03-06 22:42 ` Andreas Dilger
2016-03-07 15:52 ` Martin K. Petersen