All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Wilcox <matthew@wil.cx>
To: Christoph Hellwig <hch@infradead.org>
Cc: James Bottomley <James.Bottomley@suse.de>,
	linux-scsi@vger.kernel.org, linux-ide@vger.kernel.org,
	linux-kernel@vger.kernel.org, liml@rtr.ca, jens.axboe@oracle.com,
	dwmw2@infradead.org
Subject: Re: [PATCH 0/7] discard support revisited
Date: Wed, 2 Sep 2009 13:46:58 -0600	[thread overview]
Message-ID: <20090902194658.GQ22870@parisc-linux.org> (raw)
In-Reply-To: <20090830224829.GA22682@infradead.org>

On Sun, Aug 30, 2009 at 06:48:29PM -0400, Christoph Hellwig wrote:
> As I've recently worked on all sides of the discard battle (filesystem
> support, initiator support, and target support) here are my notes:
> 
>  - WRITE_SAME is extremly nice to implement for both the initiator and
>    target.  It has the LBA and len exactly in the same place as normal
>    16 byte commands, the payload length is fixed to one block, which
>    we can allocate once and zero so that we don't even need any memory
>    allocations for this command in the initiator.
>  - UNMAP is a pain to implement in both initiator and target.  Not
>    actuall having the LBA/len information in the cdb but in the payload
>    is at least a minor incovenience in the initator, and quite annoying
>    in the target as we now need to process payload data in the fastpath,
>    which we otherwise only do for slow path CDBs.  This will be
>    especially bad for split kernel/user target implementations.
> 
> Now the weird design of UNMAP of course has a rather (besides some
> apparent pissing contest at NetApp about who can't come with the worst
> possible protocol specifications, whose results can be seen in NFSv4
> and iSer), and that is that it allows dicarding of multiple
> discontinguous ranges.

This sentence no object ;-)

> Doing so is really bad for the filesystem as
> it requires it to track multiple outstanding discard requests, which
> requires locking, and book keeping to make sure we do not re-use these
> blocks before they are discarded.

Yeah, but we need to do that for TRIM anyway.  While we're doing it for
TRIM, we might as well do it for UNMAP.

> And at least for my target design it does not provide any measureable
> benefits at all, the discard operations are mapped to a hole punch
> ioctl on a filesystem, which has a constant basic overhead for each
> region punched (synchronous transaction commit) and a small linear
> cost per extent removed.  The only benefit of the multiple rangs unmap
> would be a saving of protocol roundtrips.

Sure, but you've got a relatively sane underpinning.  NAND is pretty
insane, and the aggregation can actually go a long way to helping with
some of the problems.

> Now that is interestingly actually a downside at least for my still
> rather dumb target implementation with a typical Linux filesystem
> workload on the initiator side.  If we actually do a lot different unmap
> operations in a single unmap command it can start to take significant
> amounts of time, and do to Linux waiting for queue drains frequently
> due to the barrier implementations we will end up waiting for the unmap
> command.

OK, but that's because you've implemented a single-range ioctl.  If we
had an ioctl which let you discard multiple ranges, it would actually
be faster (due to the barriers) than implementing a WRITE SAME.

-- 
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."

      reply	other threads:[~2009-09-02 19:46 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-08-29 23:03 [PATCH 0/7] discard support revisited Christoph Hellwig
2009-08-29 23:03 ` [PATCH 1/7] Make DISCARD_BARRIER and DISCARD_NOBARRIER writes instead of reads Christoph Hellwig
2009-09-03 17:52   ` David Woodhouse
2009-09-03 17:56     ` Matthew Wilcox
2009-08-29 23:03 ` [PATCH 2/7] block: use blkdev_issue_discard in blk_ioctl_discard Christoph Hellwig
2009-09-01  9:06   ` Steven Whitehouse
2009-09-11 22:26   ` Christoph Hellwig
2009-09-14  9:40     ` Jens Axboe
2009-08-29 23:03 ` [PATCH 3/7] block: discard may need to allocate pages Christoph Hellwig
2009-08-29 23:03 ` [PATCH 4/7] sd: add support for WRITE SAME (16) with unmap bit Christoph Hellwig
2009-08-30  0:43   ` Douglas Gilbert
2009-08-30  1:05     ` Christoph Hellwig
2009-08-30  2:43       ` Douglas Gilbert
2009-08-30  2:48         ` Christoph Hellwig
2009-08-30 11:12   ` Sergei Shtylyov
2009-08-30 17:14     ` Christoph Hellwig
2009-08-29 23:03 ` [PATCH 5/7] libata: Add support for TRIM Christoph Hellwig
2009-08-29 23:03 ` [PATCH 6/7] block: allow large discard requests Christoph Hellwig
2009-08-30  2:49   ` Mark Lord
2009-08-30  2:50     ` Matthew Wilcox
2009-08-30  2:52       ` Mark Lord
2009-08-30  2:56         ` Christoph Hellwig
2009-08-29 23:03 ` [PATCH 7/7] xfs: add batches discard support Christoph Hellwig
2009-08-29 23:37 ` [PATCH 0/7] discard support revisited Matthew Wilcox
2009-08-30  2:15   ` Christoph Hellwig
2009-08-30  3:03     ` Matthew Wilcox
2009-08-30 20:17     ` James Bottomley
2009-08-30 21:42       ` Matthew Wilcox
2009-08-30 22:48       ` Christoph Hellwig
2009-09-02 19:46         ` Matthew Wilcox [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090902194658.GQ22870@parisc-linux.org \
    --to=matthew@wil.cx \
    --cc=James.Bottomley@suse.de \
    --cc=dwmw2@infradead.org \
    --cc=hch@infradead.org \
    --cc=jens.axboe@oracle.com \
    --cc=liml@rtr.ca \
    --cc=linux-ide@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.