From: NeilBrown <neilb@suse.de>
To: Andrey Kuzmin <andrey.v.kuzmin@gmail.com>
Cc: Heinz Mauelshagen <heinzm@redhat.com>,
device-mapper development <dm-devel@redhat.com>,
Shaohua Li <shli@kernel.org>,
"Martin K. Petersen" <martin.petersen@oracle.com>
Subject: Re: [PATCH] dm-raid: add RAID discard support
Date: Thu, 2 Oct 2014 09:15:03 +1000
Message-ID: <20141002091503.26582977@notabene.brown>
In-Reply-To: <CANvN+emfkxbsvmVGr6OL1cTy+jB5t7nHhzB5Ak_6oRH24ojOxw@mail.gmail.com>
On Wed, 1 Oct 2014 20:00:45 +0400 Andrey Kuzmin <andrey.v.kuzmin@gmail.com>
wrote:
> On Wed, Oct 1, 2014 at 6:56 AM, NeilBrown <neilb@suse.de> wrote:
> > On Wed, 24 Sep 2014 13:02:28 +0200 Heinz Mauelshagen <heinzm@redhat.com>
> > wrote:
> >
> >>
> >> Martin,
> >>
> >> thanks for the good explanation of the state of the discard union.
> >> Do you have an ETA for the 'zeroout, deallocate' ... support you mentioned?
> >>
> >> I was planning to have a followup patch for dm-raid supporting a dm-raid
> >> table line argument to prohibit discard passdown.
> >>
> >> In light of the fuzzy field situation wrt SSD fw and discard_zeroes_data
> >> support related to RAID4/5/6, we need that in upstream together with the
> >> initial patch.
> >>
> >> That 'no_discard_passdown' table line argument can be added to dm-raid
> >> RAID4/5/6 table lines to avoid possible data corruption but can be omitted
> >> on RAID1/10 table lines, because the latter are not suffering from any
> >> discard_zeroes_data flaw.
> >>
> >>
> >> Neil,
> >>
> >> are you going to disable discards in RAID4/5/6 shortly
> >> or rather go with your bitmap solution?
> >
> > Can I just close my eyes and hope it goes away?
> >
> > The idea of a bitmap of uninitialised areas is not a short-term solution.
> > But I'm not really keen on simply disabling discard for RAID4/5/6 either. It
> > would mean that people with good sensible hardware wouldn't be able to use
> > it properly.
> >
> > I would really rather that discard_zeroes_data were only set on devices where
> > it was actually true. Then it wouldn't be my problem any more.
> >
> > Maybe I could do a loud warning
> > "Not enabling DISCARD on RAID5 because we cannot trust committees.
> > Set "md_mod.willing_to_risk_discard=Y" if your devices read discarded
> > sectors as zeros"
> >
> > and add an appropriate module parameter......
> >
> > While we are on the topic, maybe I should write down my thoughts about the
> > bitmap thing in case someone wants to contribute.
> >
> > There are 3 states that a 'region' can be in:
> > 1- known to be in-sync
> > 2- possibly not in sync, but it should be
> > 3- probably not in sync, contains no valuable data.
> >
> > A read from '3' should return zeroes.
> > A write to '3' should change the region to be '2'. It could either
> > write zeros before allowing the write to start, or it could just start
> > a normal resync.
> >
> > Here is a question: if a region has been discarded, are we guaranteed that
> > reads are at least stable, i.e. if I read twice will I definitely get the
> > same value?
>
> Not sure about other specs, but an NVMe-compliant SSD that supports
> discard (Dataset Management command with Deallocate attribute, in NVMe
> parlance) is, per spec, required to be deterministic when a deallocated
> range is subsequently read. That's what the spec (1.1) says:
>
> The value read from a deallocated LBA shall be deterministic;
> specifically, the value returned by subsequent reads of that LBA shall
> be the same until a write occurs to that LBA. The values read from a
> deallocated LBA and its metadata (excluding protection information)
> shall be all zeros, all ones, or the last data written to the
> associated LBA and its metadata. The values read from an unwritten or
> deallocated LBA’s protection information field shall be all ones
> (indicating the protection information shall not be checked).
>
That's good to know - thanks.
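
The rule in the quoted spec text can be written down as a small check. This
is only a sketch of that one paragraph, not code from any driver or from md;
the function name check_deallocated_reads and its parameters are made up for
illustration, and protection information is ignored.

```python
def check_deallocated_reads(reads, block_size, last_written=None):
    """Check successive reads of one deallocated LBA against the quoted rule.

    reads        -- list of bytes objects, one per read, in order
    block_size   -- LBA size in bytes (hypothetical parameter)
    last_written -- last data written to the LBA, if the caller knows it
    """
    if not reads:
        return True
    first = reads[0]
    if len(first) != block_size:
        raise ValueError("read length does not match block size")

    # Deterministic: every later read must match the first one
    # (no write to the LBA is assumed to happen in between).
    if any(r != first for r in reads[1:]):
        return False

    # The value must be all zeros, all ones, or the last data written.
    allowed = [bytes(block_size), b"\xff" * block_size]
    if last_written is not None:
        allowed.append(last_written)
    return first in allowed
```

Note that the rule guarantees stable reads, not zeros: a device that keeps
returning the last written data passes the check.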
NeilBrown
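
For anyone wanting to experiment with the three region states listed in the
quoted message, here is a minimal user-space model. It is a sketch under
stated assumptions, not md code: the names RegionState, RegionMap and
resync_done are hypothetical, all regions are assumed to start in state 3,
and a write to a region in state 1 or 2 is assumed to leave the state alone
(the quoted text only specifies what happens for state 3).

```python
from enum import Enum


class RegionState(Enum):
    IN_SYNC = 1            # known to be in-sync
    MAYBE_OUT_OF_SYNC = 2  # possibly not in sync, but it should be
    DISCARDED = 3          # probably not in sync, no valuable data


class RegionMap:
    """Toy model: one state and one data buffer per region."""

    def __init__(self, num_regions, region_size):
        self.region_size = region_size
        # Assumption: a fresh map treats every region as discarded.
        self.states = [RegionState.DISCARDED] * num_regions
        self.data = [bytes(region_size)] * num_regions

    def discard(self, region):
        # Whatever the devices hold afterwards is not trusted.
        self.states[region] = RegionState.DISCARDED

    def read(self, region):
        # A read from state 3 returns zeroes, whatever is on the devices.
        if self.states[region] is RegionState.DISCARDED:
            return bytes(self.region_size)
        return self.data[region]

    def write(self, region, payload):
        if len(payload) != self.region_size:
            raise ValueError("payload must cover the whole region")
        if self.states[region] is RegionState.DISCARDED:
            # A write to state 3 moves the region to state 2; a resync
            # (or zeroing before the write) is then expected to follow.
            self.states[region] = RegionState.MAYBE_OUT_OF_SYNC
        self.data[region] = payload

    def resync_done(self, region):
        # Hypothetical hook: a finished resync moves state 2 to state 1.
        if self.states[region] is RegionState.MAYBE_OUT_OF_SYNC:
            self.states[region] = RegionState.IN_SYNC
```

The model stores whole regions for simplicity; a real implementation would
track only a couple of bits per region and leave the data on the devices.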