linux-fsdevel.vger.kernel.org archive mirror
From: Ric Wheeler <ricwheeler@gmail.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org,
	Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	lczerner@redhat.com
Subject: Re: Testing devices for discard support properly
Date: Tue, 7 May 2019 20:07:53 -0400
Message-ID: <a409b3d1-960b-84a4-1b8d-1822c305ea18@gmail.com>
In-Reply-To: <20190507220449.GP1454@dread.disaster.area>

On 5/7/19 6:04 PM, Dave Chinner wrote:
> On Mon, May 06, 2019 at 04:56:44PM -0400, Ric Wheeler wrote:
>> (repost without the html spam, sorry!)
>>
>> Last week at LSF/MM, I suggested we can provide a tool or test suite to test
>> discard performance.
>>
>> Put in the most positive light, it will be useful for drive vendors to use
>> to qualify their offerings before sending them out to the world. For
>> customers that care, they can use the same set of tests to help during
>> selection to weed out any real issues.
>>
>> Also, community users can run the same tools of course and share the
>> results.
> My big question here is this:
>
> - is "discard" even relevant for future devices?


Hard to tell - current devices vary greatly.

Keep in mind that discard (or the interfaces you mention below) is not specific
to flash-based SSDs; it is also useful for letting us free up space on software
block devices. For example, iSCSI targets backed by a file, dm-thin devices,
virtual machines backed by files on the host, etc.
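
To make "vary greatly" concrete, a test tool could start by reading what each
device advertises. A minimal sketch, assuming the standard sysfs queue
attributes (discard_granularity, discard_max_bytes, write_zeroes_max_bytes);
the helper names are hypothetical, and advertised limits say nothing about how
fast or how reliably a device actually handles the requests:

```python
import os

# Queue attributes a test tool might inspect; all values are in bytes.
DISCARD_ATTRS = ("discard_granularity", "discard_max_bytes", "write_zeroes_max_bytes")


def parse_limit(text):
    """Parse one sysfs attribute value; return None if it is not an integer."""
    try:
        return int(text.strip())
    except ValueError:
        return None


def discard_limits(dev, sysfs_root="/sys/block"):
    """Return {attribute: int or None} for a block device name such as 'sda'."""
    limits = {}
    for attr in DISCARD_ATTRS:
        path = os.path.join(sysfs_root, dev, "queue", attr)
        try:
            with open(path) as f:
                limits[attr] = parse_limit(f.read())
        except OSError:
            limits[attr] = None  # attribute missing or unreadable
    return limits


def advertises_discard(limits):
    """Assumption: a zero or missing discard_max_bytes means no discard support."""
    return bool(limits.get("discard_max_bytes"))
```

For example, advertises_discard(discard_limits("sda")) gives a first-pass
filter before any timing runs; the same check applies to dm-thin or loop
devices, since they expose the same queue directory.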

>
> i.e. before we start saying "we want discard to not suck", perhaps
> we should list all the specific uses we have for discard, what we
> expect to occur, and whether we have better interfaces than
> "discard" to achieve that thing.
>
> Indeed, we have fallocate() on block devices now, which means we
> have a well defined block device space management API for clearing
> and removing allocated block device space. i.e.:
>
> 	FALLOC_FL_ZERO_RANGE: Future reads from the range must
> 	return zero and future writes to the range must not return
> 	ENOSPC. (i.e. must remain allocated space, can physically
> 	write zeros to achieve this)
>
> 	FALLOC_FL_PUNCH_HOLE: Free the backing store and guarantee
> 	future reads from the range return zeroes. Future writes to
> 	the range may return ENOSPC. This operation fails if the
> 	underlying device cannot do this operation without
> 	physically writing zeroes.
>
> 	FALLOC_FL_PUNCH_HOLE | FALLOC_FL_NO_HIDE_STALE: run a
> 	discard on the range and provide no guarantees about the
> 	result. It may or may not do anything, and a subsequent read
> 	could return anything at all.
>
> IMO, trying to "optimise discard" is completely the wrong direction
> to take. We should be getting rid of "discard" and its interfaces and
> operations - deprecate the ioctls, fix all other kernel callers of
> blkdev_issue_discard() to call blkdev_fallocate() and ensure that
> drive vendors understand that they need to make FALLOC_FL_ZERO_RANGE
> and FALLOC_FL_PUNCH_HOLE work, and that FALLOC_FL_PUNCH_HOLE |
> FALLOC_FL_NO_HIDE_STALE is deprecated (like discard) and will be
> going away.
>
> So, can we just deprecate blkdev_issue_discard and all the
> interfaces that lead to it as a first step?
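
For reference, the three fallocate() modes listed above can be exercised from
user space roughly as follows. This is a sketch under stated assumptions, not
a tested tool: it assumes 64-bit Linux with glibc (so off_t is 64 bits), the
flag values are taken from <linux/falloc.h>, FALLOC_FL_KEEP_SIZE is added
because fallocate(2) requires it together with FALLOC_FL_PUNCH_HOLE, and the
names MODES and fallocate_range are hypothetical:

```python
import ctypes
import os

# Flag values from <linux/falloc.h>.
FALLOC_FL_KEEP_SIZE = 0x01
FALLOC_FL_PUNCH_HOLE = 0x02
FALLOC_FL_NO_HIDE_STALE = 0x04
FALLOC_FL_ZERO_RANGE = 0x10

# The three operations from the list above, keyed by a short label.
MODES = {
    "zero": FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE,
    "punch": FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
    "discard": FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE | FALLOC_FL_NO_HIDE_STALE,
}


def fallocate_range(fd, op, offset, length):
    """Call fallocate(2) on an open descriptor; raise OSError on failure."""
    libc = ctypes.CDLL(None, use_errno=True)  # looked up lazily, Linux only
    libc.fallocate.argtypes = [ctypes.c_int, ctypes.c_int,
                               ctypes.c_int64, ctypes.c_int64]
    libc.fallocate.restype = ctypes.c_int
    if libc.fallocate(fd, MODES[op], offset, length) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

A caller would open the block device read-write with os.open() and pass a
range aligned to the device's block size. All three operations destroy the
data in the range, so this is only for scratch devices.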


In this case, I think you would lose a couple of things:

* informing the block device on truncate or unlink that the space was freed up
(or we simply hide that underneath in some way, but then what does this really
change?). Wouldn't this be the most common source for informing devices of freed
space?

* the various SCSI/ATA commands are hints - the target device can ignore them -
so I think we still need to be able to do occasional clean-up passes with
something like fstrim.
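
A clean-up pass of the fstrim kind comes down to one FITRIM ioctl on a mounted
filesystem. A minimal sketch, assuming the common ioctl number encoding used on
x86 and ARM (some architectures encode the direction bits differently) and the
struct fstrim_range layout from <linux/fs.h>; the function names are
hypothetical:

```python
import fcntl
import os
import struct

# struct fstrim_range { __u64 start; __u64 len; __u64 minlen; }
FSTRIM_RANGE = "=QQQ"

# FITRIM = _IOWR('X', 121, struct fstrim_range), common (x86/ARM) encoding.
FITRIM = (3 << 30) | (struct.calcsize(FSTRIM_RANGE) << 16) | (ord("X") << 8) | 121


def pack_fstrim_range(start=0, length=2**64 - 1, minlen=0):
    """Pack the request; the default length covers the whole filesystem."""
    return struct.pack(FSTRIM_RANGE, start, length, minlen)


def fstrim(mountpoint, start=0, length=2**64 - 1, minlen=0):
    """Ask the filesystem at mountpoint to discard its unused blocks.

    Returns the byte count the filesystem reports back in the len field.
    Needs root privileges; raises OSError if FITRIM is not supported.
    """
    buf = bytearray(pack_fstrim_range(start, length, minlen))
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, FITRIM, buf)  # the kernel writes the result into buf
    finally:
        os.close(fd)
    return struct.unpack(FSTRIM_RANGE, buf)[1]
```

Because the underlying commands are hints, the returned byte count only says
what the filesystem submitted, not what the device actually released.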

Regards,

Ric


>
> Cheers,
>
> Dave.


