From: NeilBrown <neilb@suse.de>
To: Shaohua Li <shli@kernel.org>
Cc: linux-raid@vger.kernel.org
Subject: Re: [RFC]raid5: add an option to avoid copy data from bio to stripe cache
Date: Mon, 28 Apr 2014 20:08:43 +1000
Message-ID: <20140428200843.5b32cf8b@notabene.brown>
In-Reply-To: <20140428072821.GB28726@kernel.org>
On Mon, 28 Apr 2014 15:28:21 +0800 Shaohua Li <shli@kernel.org> wrote:
> On Mon, Apr 28, 2014 at 05:06:28PM +1000, NeilBrown wrote:
> > On Mon, 28 Apr 2014 14:58:41 +0800 Shaohua Li <shli@kernel.org> wrote:
> >
> > >
> > > The stripe cache has two goals:
> > > 1. cache data, so next time if data can be found in stripe cache, disk access
> > > can be avoided.
> > > 2. stable data. data is copied from bio to stripe cache and calculated parity.
> > > data written to disk is from stripe cache, so if upper layer changes bio data,
> > > data written to disk isn't impacted.
> > >
> > > In my environment, I can guarantee 2 will not happen. For 1, it's not common
> > > too. block plug mechanism will dispatch a bunch of sequential small requests
> > > together. And since I'm using SSD, I'm using small chunk size. It's rare case
> > > stripe cache is really useful.
> > >
> > > So I'd like to avoid the copy from bio to stripe cache and it's very helpful
> > > for performance. In my 1M randwrite tests, avoid the copy can increase the
> > > performance more than 30%.
> > >
> > > Of course, this shouldn't be enabled by default, so I added an option to
> > > control it.
> >
> > I'm happy to avoid copying when we know that we can.
> >
> > I'm not really happy about using a sysfs attribute to control it.
> >
> > How do you guarantee that '2' won't happen?
> >
> > BTW I don't see '1' as important. The stripe cache is really for gathering
> > writes together to increase the chance of full-stripe writes, and for
> > handling synchronisation between IO and resync/reshape/etc. The copying is
> > primarily for stability.
>
> We are using raid5 in a SCSI target appliance. BIO is dispatched from a SCSI
> target layer (like LIO) and no filesystem is involved, so I can guarantee the
> BIO data is stable.
>
> What's your favorite way to control it?

I would like a bio flag with the meaning "this data is stable until bi_end_io
is called".

I had hoped something like that would come out of the stable-pages effort,
but that focussed on meeting the needs of filesystems more than the needs
of devices.

Maybe we just need to make one ourselves.
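
To make the idea concrete, here is a rough user-space sketch, not kernel
code. Everything in it is an assumption made for illustration: the flag name
BIO_DATA_STABLE is hypothetical (no such flag exists today), and toy_bio and
toy_stripe_dev are toy stand-ins for struct bio and the per-device part of a
stripe. It only shows the decision: copy into the stripe cache unless the
submitter has promised the data is stable, in which case use the bio's page
directly for parity calculation and the write.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES 4096

/* Hypothetical flag (an assumption, not an existing kernel flag): the
 * submitter promises the payload will not change until completion. */
#define BIO_DATA_STABLE (1u << 0)

/* Toy stand-in for struct bio: one page of payload plus flags. */
struct toy_bio {
    unsigned int flags;
    unsigned char *data;                  /* owned by the submitter */
};

/* Toy stand-in for one device slot of a stripe. */
struct toy_stripe_dev {
    unsigned char cache_page[PAGE_BYTES]; /* stripe-cache copy */
    const unsigned char *write_src;       /* what parity and disk I/O read */
    bool borrowed;                        /* true if write_src is the bio's page */
};

/* Copy unless the submitter has promised stability. */
static void stripe_take_data(struct toy_stripe_dev *dev,
                             const struct toy_bio *bio)
{
    if (bio->flags & BIO_DATA_STABLE) {
        dev->write_src = bio->data;       /* no copy: use the bio page */
        dev->borrowed = true;
    } else {
        memcpy(dev->cache_page, bio->data, PAGE_BYTES);
        dev->write_src = dev->cache_page;
        dev->borrowed = false;
    }
}

/* Returns the first byte the "disk" would see if the submitter modifies
 * its buffer after submission but before the write completes. */
static unsigned char byte_seen_after_scribble(unsigned int flags)
{
    static unsigned char payload[PAGE_BYTES];
    static struct toy_stripe_dev dev;
    struct toy_bio bio = { .flags = flags, .data = payload };

    memset(payload, 0xAA, sizeof(payload));
    stripe_take_data(&dev, &bio);
    payload[0] = 0x55;                    /* upper layer changes data in flight */
    return dev.write_src[0];
}
```

Without the flag, byte_seen_after_scribble(0) still sees the original data
because of the copy. With BIO_DATA_STABLE set it sees the modified byte,
which is why the flag can only be a promise from the submitter: if the
promise is broken, data and parity may no longer agree.
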
NeilBrown