From: Keith Busch <keith.busch@intel.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfsprogs: Issue smaller discards at mkfs
Date: Thu, 26 Oct 2017 12:00:14 -0600
Message-ID: <20171026180014.GA27102@localhost.localdomain>
In-Reply-To: <20171026162518.GW5483@magnolia>

On Thu, Oct 26, 2017 at 09:25:18AM -0700, Darrick J. Wong wrote:
> On Thu, Oct 26, 2017 at 08:41:31AM -0600, Keith Busch wrote:
> > Running mkfs.xfs was discarding the entire capacity in a single range. The
> > block layer would split these into potentially many smaller requests
> > and dispatch all of them to the device at roughly the same time.
> >
> > SSD capacities are getting so large that full capacity discards will
> > take some time to complete. When discards are deeply queued, the block
> > layer may trigger timeout handling and IO failure, though the device is
> > operating normally.
> >
> > This patch uses smaller discard ranges in a loop for mkfs to avoid
> > risking such timeouts. The max discard range is arbitrarily set to
> > 128GB in this patch.
>
> I'd have thought devices would set sane blk_queue_max_discard_sectors
> so that the block layer doesn't send such a huge command that the kernel
> times out...

The block limit only specifies the maximum size of a *single* discard
request as seen by the end device. A single request of that size is not
a problem for timeouts, as far as I know.

The timeouts occur when many of them are queued at the same time: if the
device processes discards serially (many do), the last one in the queue
will have very high latency compared to the ones ahead of it.

There is no limit on the number of outstanding discard requests that
can be dispatched at the same time; the maximum number of dispatched
commands is shared with reads and writes.

> ...but then I actually went and grepped that in the kernel and
> discovered that nbd, zram, raid0, mtd, and nvme all pass in UINT_MAX,
> which is 2T. Frighteningly xen-blkfront passes in get_capacity() (which
> overflows the unsigned int parameter on big virtual disks, I guess?).
The block layer will limit a single discard range to 4GB. If the IOCTL
specifies a larger range, the block layer will split it into multiple
requests.
> (I still think this is the kernel's problem, not userspace's, but now
> with an extra layer of OMGWTF sprayed on.)
>
> I dunno. What kind of device produces these timeouts, and does it go
> away if max_discards is lowered?
We observe timeouts on capacities above 4TB: a single BLKDISCARD for that
capacity has the block layer send 1024 discard requests at 4GB each.
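
The arithmetic behind that figure can be sketched as below. This is only
an illustration: the helper names are hypothetical and are not taken from
the kernel or from xfsprogs, and the per-request service time is a
made-up parameter, since no such number is given in this thread.

```c
#include <stdint.h>

/*
 * Number of requests needed to cover `bytes` when a single request may
 * cover at most `max_bytes`. Rounds up; returns 0 if max_bytes is 0.
 * Hypothetical helper for illustration only.
 */
static uint64_t discard_request_count(uint64_t bytes, uint64_t max_bytes)
{
	if (max_bytes == 0)
		return 0;
	return bytes / max_bytes + (bytes % max_bytes != 0);
}

/*
 * Rough time until the last of `nr_requests` completes, assuming the
 * device services them strictly one after another and each takes
 * `per_request_ms` (an assumed value, not a measurement).
 */
static uint64_t last_request_wait_ms(uint64_t nr_requests,
				     uint64_t per_request_ms)
{
	return nr_requests * per_request_ms;
}
```

With 4TB capacity and a 4GB per-request limit, discard_request_count()
gives 1024. A 128GB range splits into only 32 requests of that size, so
the last request in the queue waits behind far fewer of them.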
Could the drivers do something else to mitigate this? We could set queue
depths lower to throttle dispatched commands and the problem would go
away. The depth, though, is set to optimize read and write, and we don't
want to harm that path to mitigate discard latency spikes.
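
For reference, the userspace approach from the patch description (smaller
discard ranges issued in a loop) could look roughly like the sketch
below. It is not the actual patch: the function names, the callback
indirection, and the dry-run helper are assumptions made for
illustration. Only the BLKDISCARD ioctl and the 128GB chunk size come
from this thread. The sketch assumes each ioctl returns only after its
range has completed, so no more than one chunk is outstanding at a time.

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/fs.h>		/* BLKDISCARD */

/* 128GB, the arbitrary maximum mentioned in the patch description. */
#define MAX_DISCARD_CHUNK	(128ULL << 30)

/* Callback type so the loop can be exercised without a real device. */
typedef int (*discard_fn)(int fd, uint64_t offset, uint64_t len);

/* Issue one BLKDISCARD for [offset, offset + len), both in bytes. */
static int blkdiscard_range(int fd, uint64_t offset, uint64_t len)
{
	uint64_t range[2] = { offset, len };

	return ioctl(fd, BLKDISCARD, &range);
}

/* Dry-run issuer (hypothetical): counts calls and bytes, touches nothing. */
static uint64_t dry_run_calls, dry_run_bytes;

static int dry_run_discard(int fd, uint64_t offset, uint64_t len)
{
	(void)fd;
	(void)offset;
	dry_run_calls++;
	dry_run_bytes += len;
	return 0;
}

/*
 * Discard [offset, offset + len) in pieces of at most `chunk` bytes,
 * stopping at the first failure. Returns 0 on success, negative on error.
 */
static int discard_in_chunks(int fd, uint64_t offset, uint64_t len,
			     uint64_t chunk, discard_fn issue)
{
	if (chunk == 0)
		return -1;

	while (len > 0) {
		uint64_t step = len < chunk ? len : chunk;
		int ret = issue(fd, offset, step);

		if (ret < 0)
			return ret;
		offset += step;
		len -= step;
	}
	return 0;
}
```

Passing blkdiscard_range as the callback issues real discards on an open
block device; passing dry_run_discard only counts how the range would be
split.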