From: Kinga Tanska <kinga.tanska@linux.intel.com>
To: Logan Gunthorpe <logang@deltatee.com>
Cc: linux-raid@vger.kernel.org, Jes Sorensen <jes@trained-monkey.org>,
Guoqing Jiang <guoqing.jiang@linux.dev>, Xiao Ni <xni@redhat.com>,
Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>,
Coly Li <colyli@suse.de>,
Chaitanya Kulkarni <chaitanyak@nvidia.com>,
Jonmichael Hands <jm@chia.net>,
Stephen Bates <sbates@raithlin.com>,
Martin Oliveira <Martin.Oliveira@eideticom.com>,
David Sloan <David.Sloan@eideticom.com>
Subject: Re: [PATCH mdadm v4 0/7] Write Zeroes option for Creating Arrays
Date: Thu, 3 Nov 2022 09:14:15 +0100
Message-ID: <20221103091415.00000b8c@intel.linux.com>
In-Reply-To: <20221007201037.20263-1-logang@deltatee.com>
On Fri, 7 Oct 2022 14:10:30 -0600
Logan Gunthorpe <logang@deltatee.com> wrote:
> Hi,
>
> This is the next iteration of the patchset that added the discard
> option to mdadm. Per feedback from Martin, it's more desirable
> to use the write-zeroes functionality than to rely on devices to zero
> the data on a discard request. This is because standards typically
> only require the device to make a best effort to discard data, so it
> may not actually discard (and thus zero) all of it in some circumstances.
>
> This version of the patch set adds the --write-zeroes option which
> will imply --assume-clean and write zeros to the data region in
> each disk before starting the array. This can take some time, so
> each disk is done in parallel in its own fork. To make the forking
> code easier to understand, this patch set also starts with some
> cleanup of the existing Create code.
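
The one-fork-per-disk approach described above can be sketched roughly as
follows. This is only an illustration in Python, not the code from the
patches: zero_region() and zero_all_in_parallel() are hypothetical names,
and plain writes of zero-filled buffers stand in for the kernel's
write-zeroes path.

```python
import os

CHUNK = 1 << 20  # write 1 MiB of zeros per call (arbitrary choice)


def zero_region(path, offset, length):
    """Overwrite [offset, offset + length) of `path` with zeros.

    Assumption: plain pwrite() of zero buffers is used here as a
    stand-in for a real write-zeroes request to a block device.
    """
    zeros = bytes(CHUNK)
    fd = os.open(path, os.O_WRONLY)
    try:
        done = 0
        while done < length:
            n = min(CHUNK, length - done)
            done += os.pwrite(fd, zeros[:n], offset + done)
        os.fsync(fd)
    finally:
        os.close(fd)


def zero_all_in_parallel(regions):
    """Fork one child per (path, offset, length) and wait for them all.

    Returns True only if every child exited with status 0.
    """
    pids = []
    for path, offset, length in regions:
        pid = os.fork()
        if pid == 0:
            # Child: zero one disk, then exit without returning to
            # the caller's code. Any failure yields exit status 1.
            status = 1
            try:
                zero_region(path, offset, length)
                status = 0
            finally:
                os._exit(status)
        pids.append(pid)  # parent keeps the pid in a local list

    ok = True
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        if not (os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0):
            ok = False
    return ok
```

Each tuple names a device (or, for experimenting, an ordinary file), the
offset where the data region starts, and its length, so anything before
the offset is left untouched.
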
>
> We tested write-zeroes requests on a number of modern NVMe drives from
> various manufacturers and found most are not as optimized as the
> discard path. A couple of drives that were tested did not support
> write-zeroes at all but still performed similarly with the kernel
> falling back to writing zero pages. Typically we see it take on the
> order of one minute per 100GB of data zeroed.
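
As a rough feel for what "one minute per 100GB" means in practice, here
is a back-of-the-envelope estimate. It assumes the quoted rate holds
across the whole drive and that disks zeroed in parallel do not slow each
other down, so the largest disk determines the total time. The function
name is made up for this example.

```python
SECONDS_PER_100GB = 60.0  # "on the order of one minute per 100GB"


def estimated_zeroing_seconds(disk_sizes_gb):
    """Rough wall-clock estimate for zeroing all disks in parallel.

    Assumption: the disks are independent devices, so the largest
    one dominates the total time.
    """
    if not disk_sizes_gb:
        return 0.0
    return max(disk_sizes_gb) / 100.0 * SECONDS_PER_100GB


# Four 4000 GB drives: 4000 / 100 * 60 = 2400 seconds, about 40 minutes.
print(estimated_zeroing_seconds([4000, 4000, 4000, 4000]))
```
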
>
> One reason write-zeroes is slower than discard is that today's NVMe
> devices only allow about 2MB to be zeroed in one command, whereas
> the entire drive can typically be discarded in one command. Partly,
> this is a limitation of the spec as there are only 16 bits available
> in the write-zeroes command size, but drives still don't max this out.
> Hopefully, in the future this will all be optimized a bit more
> and this work will be able to take advantage of that.
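
To put numbers on the 16-bit limit, here is a small worked calculation.
It assumes the block count field is zero-based (a value of 0 means one
block), which is my understanding of the NVMe command format, and it
reads "about 2MB" as 2 MiB. The helper names are made up for this
example.

```python
NLB_FIELD_BITS = 16  # "only 16 bits available" in the command size


def spec_max_bytes_per_command(lba_size):
    """Largest range one command could cover at a given block size.

    Assumption: the block count is zero-based, so 16 bits can
    describe up to 65536 logical blocks.
    """
    return (1 << NLB_FIELD_BITS) * lba_size


def commands_needed(total_bytes, bytes_per_command):
    """Number of commands to cover total_bytes (ceiling division)."""
    return -(-total_bytes // bytes_per_command)


print(spec_max_bytes_per_command(512))    # 33554432 bytes (32 MiB)
print(spec_max_bytes_per_command(4096))   # 268435456 bytes (256 MiB)

# Zeroing 100 GB at about 2 MiB per command versus the 512-byte-block
# maximum the field would allow:
print(commands_needed(100 * 10**9, 2 * 2**20))                        # 47684
print(commands_needed(100 * 10**9, spec_max_bytes_per_command(512)))  # 2981
```
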
>
> Logan
>
> --
>
> Changes since v3:
> * Store the pid in a local variable instead of the mdinfo struct
> (per Mariusz and Xiao)
>
> Changes since v2:
>
> * Use write-zeroes instead of discard to zero the disks (per
> Martin)
> * Due to the time required to zero the disks, each disk is
> now done in parallel with separate forks of the process.
> * In order to add the forking, some refactoring was done on the
> Create() function to make it easier to understand
> * Added a pr_info() call so that some prints can be done
> to stdout instead of stderr (per Mariusz)
> * Added KIB_TO_BYTES and SEC_TO_BYTES helpers (per Mariusz)
> * Added a test to the mdadm test suite to check that the option
> works.
> * Fixed up how the size and offset are calculated with some
> great information from Xiao.
>
> Changes since v1:
>
> * Discard the data in the devices later in the create process
> while they are already open. This requires treating the
> s.discard option the same as the s.assume_clean option.
> Per Mariusz.
> * A couple other minor cleanup changes from Mariusz.
>
> Logan Gunthorpe (7):
> Create: goto abort_locked instead of return 1 in error path
> Create: remove safe_mode_delay local variable
> Create: Factor out add_disks() helpers
> mdadm: Introduce pr_info()
> mdadm: Add --write-zeros option for Create
> tests/00raid5-zero: Introduce test to exercise --write-zeros.
> manpage: Add --write-zeroes option to manpage
>
> Create.c           | 479 ++++++++++++++++++++++++++++-----------------
> ReadMe.c           |   2 +
> mdadm.8.in         |  16 ++
> mdadm.c            |   9 +
> mdadm.h            |   7 +
> tests/00raid5-zero |  12 ++
> 6 files changed, 350 insertions(+), 175 deletions(-)
> create mode 100644 tests/00raid5-zero
>
>
> base-commit: 8b668d4aa3305af5963162b7499b128bd71f8f29
> --
> 2.30.2
Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com>