public inbox for linux-raid@vger.kernel.org
From: "Martin K. Petersen" <martin.petersen@oracle.com>
To: Logan Gunthorpe <logang@deltatee.com>
Cc: linux-raid@vger.kernel.org, Jes Sorensen <jes@trained-monkey.org>,
	Guoqing Jiang <guoqing.jiang@linux.dev>, Xiao Ni <xni@redhat.com>,
	Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>,
	Coly Li <colyli@suse.de>,
	Chaitanya Kulkarni <chaitanyak@nvidia.com>,
	Jonmichael Hands <jm@chia.net>,
	Stephen Bates <sbates@raithlin.com>,
	Martin Oliveira <Martin.Oliveira@eideticom.com>,
	David Sloan <David.Sloan@eideticom.com>
Subject: Re: [PATCH mdadm v2 0/2] Discard Option for Creating Arrays
Date: Mon, 12 Sep 2022 13:40:36 -0400
Message-ID: <yq1fsgwbijv.fsf@ca-mkp.ca.oracle.com>
In-Reply-To: <20220908230847.5749-1-logang@deltatee.com> (Logan Gunthorpe's message of "Thu, 8 Sep 2022 17:08:45 -0600")


Hi Logan!

> When specified, mdadm will send block discard (a.k.a. trim or
> deallocate) requests to all of the specified block devices. It will
> then read back parts of each device to double-check that the disks are
> now all zeros. If they are all zero, the array is in a known state and
> does not need to generate the parity, since everything is zero and
> correct.

Unfortunately that's a dangerous assertion. The drive is free to ignore
any or all parts of a discard request, and the results typically vary
depending on what else the drive is doing at the moment the request is
executed. That is, you could see completely different results on the
same drive depending on whether it was busy garbage collecting or doing
other I/O when the various portions of a discard request were
processed.

> Another option for this work is to use a write zero request. This can
> be done in Linux currently with fallocate and the FALLOC_FL_PUNCH_HOLE
> | FALLOC_FL_KEEP_SIZE flags. This will send optimized write-zero requests
> to the devices, without falling back to regular writes to zero the disk.
> The benefit of this is that the disk will explicitly read back as zeros,
> so a zero check is not necessary. The downside is that not all devices
> implement this as efficiently as the discard request, and on some of
> these devices zeroing can take multiple seconds per GB.

REQ_OP_WRITE_ZEROES was explicitly designed for this use case. It will
use discards if it is safe to do so, that is, if the device supports
deterministic zeroing, either explicitly through the storage protocol or
through ATA quirks (thanks to the drive being vendor-qualified for RAID
usage).

> Because write-zero requests may be slow and most (but not all) discard
> requests read back as zeros, this work uses only discard requests.

REQ_OP_WRITE_ZEROES will pick the optimal way to guarantee that all
blocks in the requested range return zeroes on subsequent reads.
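
To make the fallocate route quoted above concrete, here is a minimal
sketch, assuming a Linux block device that has already been opened for
writing. The helper name zero_range_no_fallback is hypothetical and is
not taken from the mdadm patch. As far as I understand the block-device
fallocate path, this flag combination asks the kernel to zero the range
without falling back to writing zero-filled buffers, so a device that
cannot offload zeroing is expected to fail the call (typically with
EOPNOTSUPP) rather than silently leave stale data behind.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/types.h>

/*
 * Hypothetical helper (an illustration, not mdadm code): ask the kernel
 * to zero [offset, offset + len) on an open block device.
 *
 * Returns 0 on success or a negative errno value. A caller should treat
 * any failure as "contents unknown" and fall back to a normal resync
 * instead of assuming the range reads back as zeroes.
 */
int zero_range_no_fallback(int fd, off_t offset, off_t len)
{
	/* Precondition: an empty or negative range is a caller bug. */
	assert(len > 0);

	if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
		      offset, len) < 0)
		return -errno;

	return 0;
}
```

A caller would invoke this once per member device over the data area and
skip the initial resync only if every call returns 0. No read-back check
is needed on success, since the request either guarantees zeroes or
fails.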

-- 
Martin K. Petersen	Oracle Linux Engineering
