linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kinga Tanska <kinga.tanska@linux.intel.com>
To: Logan Gunthorpe <logang@deltatee.com>
Cc: linux-raid@vger.kernel.org, Jes Sorensen <jes@trained-monkey.org>,
	Guoqing Jiang <guoqing.jiang@linux.dev>, Xiao Ni <xni@redhat.com>,
	Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>,
	Coly Li <colyli@suse.de>,
	Chaitanya Kulkarni <chaitanyak@nvidia.com>,
	Jonmichael Hands <jm@chia.net>,
	Stephen Bates <sbates@raithlin.com>,
	Martin Oliveira <Martin.Oliveira@eideticom.com>,
	David Sloan <David.Sloan@eideticom.com>
Subject: Re: [PATCH mdadm v4 0/7] Write Zeroes option for Creating Arrays
Date: Thu, 3 Nov 2022 09:14:15 +0100	[thread overview]
Message-ID: <20221103091415.00000b8c@intel.linux.com> (raw)
In-Reply-To: <20221007201037.20263-1-logang@deltatee.com>

On Fri,  7 Oct 2022 14:10:30 -0600
Logan Gunthorpe <logang@deltatee.com> wrote:

> Hi,
> 
> This is the next iteration of the patchset that added the discard
> option to mdadm. Per feedback from Martin, it's more desirable
> to use the write-zeroes functionality than rely on devices to zero
> the data on a discard request. This is because standards typically
> only require the device to do the best effort to discard data and
> may not actually discard (and thus zero) it all in some circumstances.
> 
> This version of the patch set adds the --write-zeroes option which
> will imply --assume-clean and write zeros to the data region in
> each disk before starting the array. This can take some time so
> each disk is done in parallel in its own fork. To make the forking
> code easier to understand this patch set also starts with some
> cleanup of the existing Create code.
> 
> We tested write-zeroes requests on a number of modern nvme drives of
> various manufacturers and found most are not as optimized as the
> discard path. A couple drives that were tested did not support
> write-zeroes at all but still performed similarly with the kernel
> falling back to writing zero pages. Typically we see it take on the
> order of one minute per 100GB of data zeroed.
> 
> One reason write-zeroes is slower than discard is that today's NVMe
> devices only allow about 2MB to be zeroed in one command where as
> the entire drive can typically be discarded in one command. Partly,
> this is a limitation of the spec as there are only 16 bits avalaible
> in the write-zeros command size but drives still don't max this out.
> Hopefully, in the future this will all be optimized a bit more
> and this work will be able to take advantage of that.
> 
> Logan
> 
> --
> 
> Changes since v3:
>    * Store the pid in a local variable instead of the mdinfo struct
>     (per Mariusz and Xiao)
> 
> Changes since v2:
> 
>    * Use write-zeroes instead of discard to zero the disks (per
>      Martin)
>    * Due to the time required to zero the disks, each disk is
>      now done in parallel with separate forks of the process.
>    * In order to add the forking some refactoring was done on the
>      Create() function to make it easier to understand
>    * Added a pr_info() call so that some prints can be done
>      to stdout instead of stdour (per Mariusz)
>    * Added KIB_TO_BYTES and SEC_TO_BYTES helpers (per Mariusz)
>    * Added a test to the mdadm test suite to test the option
>      works.
>    * Fixed up how the size and offset are calculated with some
>      great information from Xiao.
> 
> Changes since v1:
> 
>    * Discard the data in the devices later in the create process
>      while they are already open. This requires treating the
>      s.discard option the same as the s.assume_clean option.
>      Per Mariusz.
>    * A couple other minor cleanup changes from Mariusz.
> 
> 
> *** BLURB HERE ***
> 
> Logan Gunthorpe (7):
>   Create: goto abort_locked instead of return 1 in error path
>   Create: remove safe_mode_delay local variable
>   Create: Factor out add_disks() helpers
>   mdadm: Introduce pr_info()
>   mdadm: Add --write-zeros option for Create
>   tests/00raid5-zero: Introduce test to exercise --write-zeros.
>   manpage: Add --write-zeroes option to manpage
> 
>  Create.c           | 479
> ++++++++++++++++++++++++++++----------------- ReadMe.c           |
> 2 + mdadm.8.in         |  16 ++
>  mdadm.c            |   9 +
>  mdadm.h            |   7 +
>  tests/00raid5-zero |  12 ++
>  6 files changed, 350 insertions(+), 175 deletions(-)
>  create mode 100644 tests/00raid5-zero
> 
> 
> base-commit: 8b668d4aa3305af5963162b7499b128bd71f8f29
> --
> 2.30.2

Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com>

      parent reply	other threads:[~2022-11-03  8:14 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-07 20:10 [PATCH mdadm v4 0/7] Write Zeroes option for Creating Arrays Logan Gunthorpe
2022-10-07 20:10 ` [PATCH mdadm v4 1/7] Create: goto abort_locked instead of return 1 in error path Logan Gunthorpe
2022-10-07 20:10 ` [PATCH mdadm v4 2/7] Create: remove safe_mode_delay local variable Logan Gunthorpe
2022-10-07 20:10 ` [PATCH mdadm v4 3/7] Create: Factor out add_disks() helpers Logan Gunthorpe
2022-10-07 20:10 ` [PATCH mdadm v4 4/7] mdadm: Introduce pr_info() Logan Gunthorpe
2022-10-07 20:10 ` [PATCH mdadm v4 5/7] mdadm: Add --write-zeros option for Create Logan Gunthorpe
2022-10-07 20:10 ` [PATCH mdadm v4 6/7] tests/00raid5-zero: Introduce test to exercise --write-zeros Logan Gunthorpe
2022-10-07 20:10 ` [PATCH mdadm v4 7/7] manpage: Add --write-zeroes option to manpage Logan Gunthorpe
2022-10-12  1:09 ` [PATCH mdadm v4 0/7] Write Zeroes option for Creating Arrays Xiao Ni
2022-10-12 16:59   ` Logan Gunthorpe
2022-10-13  1:33     ` Martin K. Petersen
2022-10-13  7:51       ` Xiao Ni
2022-10-26  2:41         ` Martin K. Petersen
2022-10-27  8:44           ` Xiao Ni
2022-11-16 17:11       ` Logan Gunthorpe
2022-11-03  8:14 ` Kinga Tanska [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221103091415.00000b8c@intel.linux.com \
    --to=kinga.tanska@linux.intel.com \
    --cc=David.Sloan@eideticom.com \
    --cc=Martin.Oliveira@eideticom.com \
    --cc=chaitanyak@nvidia.com \
    --cc=colyli@suse.de \
    --cc=guoqing.jiang@linux.dev \
    --cc=jes@trained-monkey.org \
    --cc=jm@chia.net \
    --cc=linux-raid@vger.kernel.org \
    --cc=logang@deltatee.com \
    --cc=mariusz.tkaczyk@linux.intel.com \
    --cc=sbates@raithlin.com \
    --cc=xni@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).