Linux RAID subsystem development
From: Shaohua Li <shli@fb.com>
To: NeilBrown <neilb@suse.de>
Cc: dan.j.williams@intel.com, linux-raid@vger.kernel.org,
	songliubraving@fb.com, Kernel-team@fb.com
Subject: Re: [RFC] raid5: add a log device to fix raid5/6 write hole issue
Date: Wed, 8 Apr 2015 23:15:47 -0700
Message-ID: <20150409061545.GA864165@devbig257.prn2.facebook.com>
In-Reply-To: <20150409150459.320c668a@notabene.brown>

On Thu, Apr 09, 2015 at 03:04:59PM +1000, NeilBrown wrote:
> On Wed, 8 Apr 2015 17:43:11 -0700 Shaohua Li <shli@fb.com> wrote:
> 
> > Hi,
> > This is what I'm working on now, and hopefully I'll have the basic code
> > running next week. The new design will do caching and fix the write hole
> > issue too. Before I post the code, I'd like to check whether the design
> > has obvious issues.
> 
> I can't say I'm excited about it....
> 
> You still haven't explained why you would ever want to read data from the
> "cache".  Why not just keep everything in the stripe-cache until it is safe
> in the RAID?  I asked before and you said:
> 
> >> I'm not enthusiastic to use stripe cache though, we can't keep all data
> >> in stripe cache. What we really need is an index.
> 
> which is hardly an answer.  Why cannot you keep all the data in the stripe
> cache?  How much data is there? How much memory can you afford to dedicate?
> 
> You must have some very long sustained bursts of writes which are much faster
> than the RAID can accept in order to not be able to keep everything in memory.
> 
> 
> Your cache layout seems very rigid.  I would much rather have a layout that
> was very general and flexible.  If you want to always allocate a chunk at a
> time then fine, but don't force that on the cache layout.
> 
> The log really should be very simple.  A block describing what comes next,
> then lots of data/parity.  Then another block and more data, etc.
> Each metadata block points to the next one.
> If you need an index of the cache, you keep that in memory.  On restart, you
> read all of the metadata blocks and build up the index.
> 
> I think that space in the log should be reclaimed in exactly the order that
> it is written, so the active part of the log is contiguous.   Obviously
> individual blocks become inactive in arbitrary order as they are written to
> the RAID, but each extent of the log becomes free in order.
> If you want that to happen out of order, you would need to present a very
> good reason.
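
To make the log layout described in the quoted text concrete, here is a
minimal Python sketch. It is not code from this thread or from the md
driver; the class SimpleLog and its methods (append, rebuild_index,
mark_written, reclaim) are hypothetical names chosen for illustration, and
the "device" is just an in-memory ordered dictionary. It assumes one
metadata block per extent, an index that lives only in memory, and space
that is freed strictly in the order it was written.

```python
from collections import OrderedDict


class SimpleLog:
    """Toy append-only log: each extent is one metadata block plus its data."""

    def __init__(self):
        # seq -> {"next": seq of the following extent, "data": {raid_sector: payload}}
        self.extents = OrderedDict()
        self.next_seq = 0
        self.index = {}    # in-memory only: raid_sector -> seq of newest extent
        self.active = {}   # seq -> raid_sectors not yet written to the RAID

    def append(self, writes):
        """Append one extent describing `writes` (raid_sector -> payload)."""
        seq = self.next_seq
        self.next_seq += 1
        # The metadata block records where the next metadata block will be.
        self.extents[seq] = {"next": self.next_seq, "data": dict(writes)}
        self.active[seq] = set(writes)
        for sector in writes:
            self.index[sector] = seq
        return seq

    def rebuild_index(self):
        """What a restart would do: walk the metadata chain, oldest first."""
        index = {}
        if not self.extents:
            return index
        seq = next(iter(self.extents))
        while seq in self.extents:
            for sector in self.extents[seq]["data"]:
                index[sector] = seq  # newer extents override older ones
            seq = self.extents[seq]["next"]
        return index

    def mark_written(self, seq, sector):
        """A block reached the RAID; blocks go inactive in arbitrary order."""
        self.active[seq].discard(sector)

    def reclaim(self):
        """Free extents strictly in write order, stopping at the first live one."""
        freed = []
        while self.extents:
            seq = next(iter(self.extents))
            if self.active[seq]:
                break
            del self.extents[seq]
            del self.active[seq]
            freed.append(seq)
        # Drop index entries that pointed into freed extents.
        for sector in [s for s, q in self.index.items() if q in freed]:
            del self.index[sector]
        return freed
```

In this sketch, finishing every block of a newer extent frees nothing while
an older extent still holds a live block, so the active part of the log
stays contiguous. The memory cost is the index plus the per-extent
bookkeeping, which is separate from the question of whether the data itself
can stay in the stripe cache.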

I came to the same idea when I was thinking about a caching layer, but
memory size is the main blocker. If the solution requires a large
amount of extra memory, it's not cost effective, which makes it a hard sell
to replace hardware raid with software raid. The design depends entirely
on whether we can keep all the data in memory. I don't have an answer yet for
how much memory we would need to make the aggregation efficient. I guess only
numbers can tell. I'll try to collect some data and get back to you.

Thanks,
Shaohua


Thread overview: 22+ messages
2015-03-30 22:25 [RFC] raid5: add a log device to fix raid5/6 write hole issue Shaohua Li
2015-04-01  3:47 ` Dan Williams
2015-04-01  5:53   ` Shaohua Li
2015-04-01  6:02     ` NeilBrown
2015-04-01 17:14       ` Shaohua Li
2015-04-01 18:36   ` Piergiorgio Sartor
2015-04-01 18:46     ` Dan Williams
2015-04-01 20:07       ` Jiang, Dave
2015-04-01 18:46     ` Alireza Haghdoost
2015-04-01 19:57       ` Wols Lists
2015-04-01 20:04         ` Alireza Haghdoost
2015-04-01 20:18           ` Wols Lists
2015-04-01 20:17         ` Jens Axboe
2015-04-01 21:53 ` NeilBrown
2015-04-01 23:40   ` Shaohua Li
2015-04-02  0:19     ` NeilBrown
2015-04-02  4:07       ` Shaohua Li
2015-04-09  0:43         ` Shaohua Li
2015-04-09  5:04           ` NeilBrown
2015-04-09  6:15             ` Shaohua Li [this message]
2015-04-09 15:37               ` Dan Williams
2015-04-09 16:03                 ` Shaohua Li
