Linux RAID subsystem development
* Re: Question about resync in RAID5
  2013-04-16  9:59 Robin Dong
@ 2013-04-15 20:27 ` Oliver Schinagl
  2013-04-16 15:20   ` Keith Keller
  0 siblings, 1 reply; 9+ messages in thread
From: Oliver Schinagl @ 2013-04-15 20:27 UTC
  To: Robin Dong; +Cc: linux-raid

On 16-04-13 11:59, Robin Dong wrote:
> Dear Raid experts,
>
> I have a software RAID5 volume, and after one disk failed I replaced it
> with a new hard disk. The RAID5 volume then began to resync the WHOLE
> new disk.
> There is only 1 GB of data in the RAID5 volume, so I think resyncing the
> whole disk is inefficient.
The md driver, however, does not know this. It syncs everything.

Having said that, how do you know there's only 1 GB in use? Maybe there
is a hidden partition on the remainder of the disk? Or maybe it houses
an encrypted container, and inside that container another encrypted
container (see TrueCrypt on deniability).

The fact is, you do not know for sure how much data is or is not in use.
> Take ZFS for example: when a disk is replaced, it only resyncs the data
> that was written after the creation of the volume.
But md is only the RAID layer, whereas ZFS is both the lower layer AND
the filesystem, so you can't fairly compare them. Btrfs may do this, but
I don't know, and I don't think it's ready as a replacement yet anyway.
>
> Is there any way to resync just the WRITTEN data to the newly added
> disk? Or is there any plan to develop this feature?
I highly doubt it since, as said above, the md layer cannot possibly
know what data is on the disk.

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Question about resync in RAID5
@ 2013-04-16  9:59 Robin Dong
  2013-04-15 20:27 ` Oliver Schinagl
  0 siblings, 1 reply; 9+ messages in thread
From: Robin Dong @ 2013-04-16  9:59 UTC
  To: linux-raid; +Cc: Robin Dong

Dear Raid experts,

I have a software RAID5 volume, and after one disk failed I replaced it
with a new hard disk. The RAID5 volume then began to resync the WHOLE
new disk.
There is only 1 GB of data in the RAID5 volume, so I think resyncing the
whole disk is inefficient.
Take ZFS for example: when a disk is replaced, it only resyncs the data
that was written after the creation of the volume.

Is there any way to resync just the WRITTEN data to the newly added
disk? Or is there any plan to develop this feature?

* Question about resync in RAID5
@ 2013-04-16 10:02 Robin Dong
  2013-04-16 10:24 ` Robin Hill
  0 siblings, 1 reply; 9+ messages in thread
From: Robin Dong @ 2013-04-16 10:02 UTC
  To: linux-raid

Dear Raid experts,

I have a software RAID5 volume, and after one disk failed I replaced it
with a new hard disk. The RAID5 volume then began to resync the WHOLE
new disk.
There is only 1 GB of data in the RAID5 volume, so I think resyncing the
whole disk is inefficient.
Take ZFS for example: when a disk is replaced, it only resyncs the data
that was written after the creation of the volume.

Is there any way to resync just the WRITTEN data to the newly added
disk? Or is there any plan to develop this feature?

* Re: Question about resync in RAID5
  2013-04-16 10:02 Question about resync in RAID5 Robin Dong
@ 2013-04-16 10:24 ` Robin Hill
  2013-04-16 15:46   ` Roy Sigurd Karlsbakk
  0 siblings, 1 reply; 9+ messages in thread
From: Robin Hill @ 2013-04-16 10:24 UTC
  To: Robin Dong; +Cc: linux-raid

On Tue Apr 16, 2013 at 06:02:40PM +0800, Robin Dong wrote:

> Dear Raid experts,
> 
> I have a software RAID5 volume, and after one disk failed I replaced it
> with a new hard disk. The RAID5 volume then began to resync the WHOLE
> new disk.
> There is only 1 GB of data in the RAID5 volume, so I think resyncing the
> whole disk is inefficient.
> Take ZFS for example: when a disk is replaced, it only resyncs the data
> that was written after the creation of the volume.
> 
> Is there any way to resync just the WRITTEN data to the newly added
> disk? Or is there any plan to develop this feature?

No, because md is a block device it has no knowledge of what data has
actually been written to the disk or where it is stored. This means it
has to rebuild the entire disk.

There are plans to improve this by keeping track of which stripes have
been written to, which should help (and will also mean no need for an
initial RAID sync). I don't know whether this has progressed beyond the
idea stage yet though.
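
A rough sketch of why such tracking would help, assuming a hypothetical
set of written stripe numbers (written_stripes) that md does not
maintain today. RAID5 recovers a missing block by XOR-ing the surviving
data and parity blocks of the same stripe; with tracking, the rebuild
could skip stripes that were never written. The function names are made
up for illustration and this is not md code:

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID5 parity arithmetic)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_member(surviving, written_stripes):
    """Rebuild one failed member from the surviving members.

    surviving       -- list of members, each a list of per-stripe blocks (bytes)
    written_stripes -- hypothetical set of stripe numbers known to hold data
    Returns one entry per stripe: the recovered block, or None if the
    stripe was never written and therefore needs no recovery.
    """
    rebuilt = []
    for stripe in range(len(surviving[0])):
        if stripe in written_stripes:
            # Missing block = XOR of all surviving blocks in this stripe.
            rebuilt.append(xor_blocks([member[stripe] for member in surviving]))
        else:
            rebuilt.append(None)
    return rebuilt
```

Without the tracking set, every stripe has to go through the XOR step,
which is the full rebuild described above.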

The alternative is to use a filesystem which incorporates RAID, hence
knows what needs keeping in sync (ZFS, possibly btrfs - I don't know
where they're up to with that).

Cheers,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |


* Re: Question about resync in RAID5
  2013-04-15 20:27 ` Oliver Schinagl
@ 2013-04-16 15:20   ` Keith Keller
  0 siblings, 0 replies; 9+ messages in thread
From: Keith Keller @ 2013-04-16 15:20 UTC
  To: linux-raid

On 2013-04-15, Oliver Schinagl <oliver+list@schinagl.nl> wrote:
> I highly doubt it, since, as said above, the md layer could not ever 
> possibly know what data is on the disk.

In theory it could, if it recorded metadata about what it's written over
the array's lifetime.  Some newer 3ware controllers claim to support
this feature (it's called something like "rapid RAID recovery").  Their
docs don't describe how or where this information is stored.

But AFAICT md doesn't support this feature.

--keith

-- 
kkeller@wombat.san-francisco.ca.us



* Re: Question about resync in RAID5
  2013-04-16 10:24 ` Robin Hill
@ 2013-04-16 15:46   ` Roy Sigurd Karlsbakk
  2013-04-16 16:55     ` Robin Hill
  0 siblings, 1 reply; 9+ messages in thread
From: Roy Sigurd Karlsbakk @ 2013-04-16 15:46 UTC
  To: Robin Hill; +Cc: linux-raid, Robin Dong

> > Is there any way to resync just the WRITTEN data to the newly added
> > disk? Or is there any plan to develop this feature?
> 
> No, because md is a block device it has no knowledge of what data has
> actually been written to the disk or where it is stored. This means it
> has to rebuild the entire disk.
> 
> There are plans to improve this by keeping track of which stripes have
> been written to, which should help (and will also mean no need for an
> initial RAID sync). I don't know whether this has progressed beyond
> the idea stage yet though.

Interesting - do you have any docs about this? I haven't seen any talk on this list about it...

Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 98013356
roy@karlsbakk.net
http://blogg.karlsbakk.net/
GPG Public key: http://karlsbakk.net/roysigurdkarlsbakk.pubkey.txt
--
In all pedagogy it is essential that the curriculum is presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms with xenotypic etymology. In most cases adequate and relevant synonyms exist in Norwegian.

* Re: Question about resync in RAID5
  2013-04-16 15:46   ` Roy Sigurd Karlsbakk
@ 2013-04-16 16:55     ` Robin Hill
       [not found]       ` <CANsebLH0PcjV2nmcQ0TmEE_8XxF6pi53baSenV3qKwdRxioK+Q@mail.gmail.com>
  0 siblings, 1 reply; 9+ messages in thread
From: Robin Hill @ 2013-04-16 16:55 UTC
  To: Roy Sigurd Karlsbakk; +Cc: linux-raid, Robin Dong

On Tue Apr 16, 2013 at 05:46:07PM +0200, Roy Sigurd Karlsbakk wrote:

> > > Is there any way to resync just the WRITTEN data to the newly added
> > > disk? Or is there any plan to develop this feature?
> > 
> > No, because md is a block device it has no knowledge of what data has
> > actually been written to the disk or where it is stored. This means it
> > has to rebuild the entire disk.
> > 
> > There are plans to improve this by keeping track of which stripes have
> > been written to, which should help (and will also mean no need for an
> > initial RAID sync). I don't know whether this has progressed beyond
> > the idea stage yet though.
> 
> Interesting - do you have any docs about this? I haven't seen any talk
> on this list about it...
> 
It's from Neil's 2011 MD/RAID roadmap:
    http://neil.brown.name/blog/20110216044002

The basic idea was mentioned in a 2009 roadmap as well, but is a lot
more fleshed out in the above one.

Cheers,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |


* Re: Question about resync in RAID5
       [not found]       ` <CANsebLH0PcjV2nmcQ0TmEE_8XxF6pi53baSenV3qKwdRxioK+Q@mail.gmail.com>
@ 2013-04-22  0:15         ` NeilBrown
  2013-04-22  9:16           ` Goswin von Brederlow
  0 siblings, 1 reply; 9+ messages in thread
From: NeilBrown @ 2013-04-22  0:15 UTC
  To: Robin Dong; +Cc: robin, linux-raid, Tao Ma

On Wed, 17 Apr 2013 12:29:07 +0800 Robin Dong <robin.k.dong@gmail.com> wrote:

> Hi, Neil
> 
> Has anyone begun developing the non-sync-bitmap feature? If not, I would
> like to give it a try.
> Will the non-sync bitmap be built on the write-intent bitmap, or do I
> need to create a new bitmap in the 'array state info' of the v1.x metadata?
> 

Hi Robin et al,

No, no implementation has been started.

You cannot use the same bits as the write intent bitmap as they mean
something different.  But you could possibly use the same storage space.
i.e. have an alternate style of bitmap where there are two bits per 'chunk'.
One bit means "this was written recently, a resync might be needed after a
crash" and the other means "This has been written since array create, so the
chunk needs to be recovered to any spare".

I don't know if this is the best approach - I haven't really thought about it
much.

There are several parts to this:

 1/ decide how to store the new bitmap on disk, and implement that
 2/ decide how to store the new bitmap in memory and implement that.
    It might be the same...
 3/ Set a bit in the new bitmap on each write (if it isn't set), and
    trigger a resync of that chunk.
 4/ Update the recovery process for each level to honour this new bit
 5/ Allow "discard" operations to clear these bits.
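
The two-bits-per-chunk idea and steps 3 to 5 above could look roughly
like this in memory. This is only a sketch under assumptions: the class
and method names are invented, one byte per chunk is used for
readability instead of packed bits, and nothing here reflects actual md
code or its on-disk format.

```python
DIRTY = 0b01   # written recently; a resync might be needed after a crash
IN_USE = 0b10  # written since array creation; must be recovered to a spare


class TwoBitChunkMap:
    """Hypothetical in-memory map holding two state bits per chunk."""

    def __init__(self, n_chunks):
        self.bits = bytearray(n_chunks)  # every chunk starts clean and unused

    def on_write(self, chunk):
        """Step 3: mark the chunk; report whether this was its first write."""
        first_write = not (self.bits[chunk] & IN_USE)
        self.bits[chunk] |= DIRTY | IN_USE
        return first_write  # a caller would resync the chunk when True

    def on_write_complete(self, chunk):
        """Clear only the crash-recovery bit once the write is stable."""
        self.bits[chunk] &= ~DIRTY & 0xFF

    def on_discard(self, chunk):
        """Step 5: a discarded chunk no longer needs recovery."""
        self.bits[chunk] = 0

    def chunks_to_recover(self):
        """Step 4: the only chunks a rebuild onto a spare has to copy."""
        return [c for c, b in enumerate(self.bits) if b & IN_USE]

    def chunks_to_resync_after_crash(self):
        """What the existing write-intent bit already answers."""
        return [c for c, b in enumerate(self.bits) if b & DIRTY]
```

The two bits have different lifetimes: DIRTY is cleared shortly after
each write, while IN_USE stays set until a discard clears it.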

If you do proceed with this,  feel free to post a more concrete design before
proceeding to code - or just post code if you prefer.

NeilBrown



* Re: Question about resync in RAID5
  2013-04-22  0:15         ` NeilBrown
@ 2013-04-22  9:16           ` Goswin von Brederlow
  0 siblings, 0 replies; 9+ messages in thread
From: Goswin von Brederlow @ 2013-04-22  9:16 UTC
  To: NeilBrown; +Cc: Robin Dong, robin, linux-raid, Tao Ma

On Mon, Apr 22, 2013 at 10:15:52AM +1000, NeilBrown wrote:
> Hi Robin et al,
> 
> No, no implementation has been started.
> 
> You cannot use the same bits as the write intent bitmap as they mean
> something different.  But you could possibly use the same storage space.
> i.e. have an alternate style of bitmap where there are two bits per 'chunk'.
> One bit means "this was written recently, a resync might be needed after a
> crash" and the other means "This has been written since array create, so the
> chunk needs to be recovered to any spare".
> 
> I don't know if this is the best approach - I haven't really thought about it
> much.
> 
> There are several parts to this:
> 
>  1/ decide how to store the new bitmap on disk, and implement that
>  2/ decide how to store the new bitmap in memory and implement that.
>     It might be the same...
>  3/ Set a bit in the new bitmap on each write (if it isn't set), and
>     trigger a resync of that chunk.
>  4/ Update the recovery process for each level to honour this new bit
>  5/ Allow "discard" operations to clear these bits.
> 
> If you do proceed with this,  feel free to post a more concrete design before
> proceeding to code - or just post code if you prefer.
> 
> NeilBrown

I would suggest using two bitmaps instead of one with 2 bits per chunk. Why?

- possibly different granularity for each bitmap
- write/rewrite of blocks is independent of first write/discard; they
  change with different frequency. Especially when discarding from a
  cron job instead of on-the-fly through the FS.

Maybe not even use a bitmap for discard. Some tree structure that would
be able to handle both block-sized chunks being discarded and gigabytes
of contiguous space being in the same state. Something that uses a
bitmap for parts that fragment down to block-sized chunks and segments
for contiguous blocks.
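
To make the segment part of this concrete, here is a minimal sketch of
tracking discarded space as merged extents. It is an assumption-laden
illustration: ExtentSet is an invented name, it uses sorted Python lists
with bisect rather than a real tree, and it leaves out the bitmap
fallback for heavily fragmented regions.

```python
import bisect


class ExtentSet:
    """Sorted, non-overlapping half-open block ranges [start, end)."""

    def __init__(self):
        self.starts = []
        self.ends = []

    def add(self, start, end):
        """Mark [start, end) as discarded, merging with neighbours."""
        lo = bisect.bisect_left(self.ends, start)    # first extent touching on the left
        hi = bisect.bisect_right(self.starts, end)   # one past last extent touching on the right
        if lo < hi:
            start = min(start, self.starts[lo])
            end = max(end, self.ends[hi - 1])
        self.starts[lo:hi] = [start]
        self.ends[lo:hi] = [end]

    def remove(self, start, end):
        """Blocks in [start, end) were written, so they are no longer discarded."""
        lo = bisect.bisect_right(self.ends, start)   # first extent strictly overlapping
        hi = bisect.bisect_left(self.starts, end)    # one past last overlapping extent
        new_starts, new_ends = [], []
        if lo < hi:
            if self.starts[lo] < start:              # keep the piece left of the hole
                new_starts.append(self.starts[lo])
                new_ends.append(start)
            if self.ends[hi - 1] > end:              # keep the piece right of the hole
                new_starts.append(end)
                new_ends.append(self.ends[hi - 1])
        self.starts[lo:hi] = new_starts
        self.ends[lo:hi] = new_ends

    def __contains__(self, block):
        i = bisect.bisect_right(self.starts, block) - 1
        return i >= 0 and block < self.ends[i]

    def extents(self):
        return list(zip(self.starts, self.ends))
```

A single extent describes gigabytes of contiguous discarded space at
constant cost, whereas a bitmap needs one bit per chunk no matter how
uniform the state is.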

Regards
	Goswin

Thread overview: 9+ messages
2013-04-16 10:02 Question about resync in RAID5 Robin Dong
2013-04-16 10:24 ` Robin Hill
2013-04-16 15:46   ` Roy Sigurd Karlsbakk
2013-04-16 16:55     ` Robin Hill
     [not found]       ` <CANsebLH0PcjV2nmcQ0TmEE_8XxF6pi53baSenV3qKwdRxioK+Q@mail.gmail.com>
2013-04-22  0:15         ` NeilBrown
2013-04-22  9:16           ` Goswin von Brederlow
  -- strict thread matches above, loose matches on Subject: below --
2013-04-16  9:59 Robin Dong
2013-04-15 20:27 ` Oliver Schinagl
2013-04-16 15:20   ` Keith Keller
