* Write-Intent Bitmaps and disk caches
From: Ty! Boyack @ 2008-04-08 20:16 UTC
To: linux-raid
I'm a little confused on the write-intent bitmaps and how they interact
with the disk caches - I would appreciate any clarification here.
The way I understand it, if a device fails in a raid set, the bitmap
will track regions that have changed since the failed device left the
array. If the device is added back in, the re-sync time is much shorter
since only some blocks have to be re-sync'ed.
But in this case it's possible that blocks A, B, and C were written to
the device, and the failure was detected on block C. Blocks A and B
would then probably still be held in the device's cache (the hard drive's
cache, or, worse, the much larger cache of a disk array used as a
device). When the device is re-added, block C would presumably be
re-synced as requested by the bitmap, but would A and B be lost forever
because they fell out of the drive's cache?
Does the bitmap, or something else, take this into account? Or will
this eventually lead to an inconsistent read or data corruption down the
road? Or am I just out waving my paranoid flag today?
--
-===========================-
Ty! Boyack
NREL Unix Network Manager
ty@nrel.colostate.edu
(970) 491-1186
-===========================-
* Re: Write-Intent Bitmaps and disk caches
From: Mike Snitzer @ 2008-04-09 2:44 UTC
To: Ty! Boyack; +Cc: linux-raid
On Tue, Apr 8, 2008 at 4:16 PM, Ty! Boyack <ty@nrel.colostate.edu> wrote:
> I'm a little confused on the write-intent bitmaps and how they interact with
> the disk caches - I would appreciate any clarification here.
>
> The way I understand it, if a device fails in a raid set, the bitmap will
> track regions that have changed since the failed device left the array. If
> the device is added back in, the re-sync time is much shorter since only
> some blocks have to be re-sync'ed.
>
> But in this case it's possible that blocks A, B, and C were written to the
> device, and the failure was detected on block C. Blocks A and B would then
> probably still be held in the device's cache (the hard drive's cache, or,
> worse, the much larger cache of a disk array used as a device). When the
> device is re-added, block C would presumably be re-synced as requested by
> the bitmap, but would A and B be lost forever because they fell out of the
> drive's cache?
>
> Does the bitmap, or something else, take this into account? Or will this
> eventually lead to an inconsistent read or data corruption down the road?
> Or am I just out waving my paranoid flag today?
In your example, both the in-memory and on-disk bitmaps would be
updated to set the dirty bit for the region containing block C. When
the failed member is re-added to the array, that region will get
resynced and its dirty bit cleared in all bitmaps.
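The bookkeeping described above can be sketched as a toy model. This is only an illustration under loud assumptions, not MD's code: the names ToyBitmap, write, readd and demo are hypothetical, members are plain dicts, one bit covers a fixed number of blocks, and a bit is cleared as soon as a write reaches every member (the real driver is more careful about when it clears bits).

```python
# Toy model of a write-intent bitmap on a two-way mirror.
# All names here are hypothetical; this is not the MD driver's logic.

class ToyBitmap:
    def __init__(self, chunk_blocks):
        # chunk_blocks: how many blocks one bitmap bit covers (assumed value)
        self.chunk_blocks = chunk_blocks
        self.dirty = set()  # indices of regions whose bit is set

    def region_of(self, block):
        return block // self.chunk_blocks

    def mark_dirty(self, block):
        self.dirty.add(self.region_of(block))

    def clear(self, region):
        self.dirty.discard(region)


def write(block, data, members, bitmap):
    """Write one block to every member; return True if all members got it."""
    bitmap.mark_dirty(block)  # record the intent before issuing the write
    all_ok = True
    for m in members:
        if m["failed"]:
            all_ok = False  # this member missed the write
            continue
        m["blocks"][block] = data
    if all_ok:
        bitmap.clear(bitmap.region_of(block))  # members agree on this region
    return all_ok


def readd(member, source, bitmap):
    """Copy only the dirty regions from a good member, then clear the bits."""
    copied = []
    for region in sorted(bitmap.dirty):
        start = region * bitmap.chunk_blocks
        for block in range(start, start + bitmap.chunk_blocks):
            if block in source["blocks"]:
                member["blocks"][block] = source["blocks"][block]
                copied.append(block)
    bitmap.dirty.clear()
    member["failed"] = False
    return copied


def demo():
    m1 = {"failed": False, "blocks": {}}
    m2 = {"failed": False, "blocks": {}}
    bm = ToyBitmap(chunk_blocks=4)
    A, B, C = 0, 1, 8            # A and B share a region; C is in another
    write(A, "a", [m1, m2], bm)  # completes on both members
    write(B, "b", [m1, m2], bm)  # completes on both members
    m2["failed"] = True          # failure detected on the write of C
    write(C, "c", [m1, m2], bm)  # only m1 gets C; its region stays dirty
    copied = readd(m2, m1, bm)   # partial resync on re-add
    return copied, m1, m2, bm
```

In this model only the region holding C is copied on re-add. A and B are left alone because their writes had already completed on both members before the failure.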
I don't see why you're concerned about blocks A and B: if those writes
completed on all members, each individual drive has marked the
associated block requests uptodate in the block device drivers beneath
Linux MD.
The MD write-intent bitmap doesn't have anything to do with the
hardware cache of each individual raid member. Blocks A and B
wouldn't get dropped unless you have writeback cache enabled on the
drives (without battery backup) and they suffer a power loss.
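To make the power-loss case concrete, here is a rough sketch of a drive with a volatile cache. It is a simplified model with made-up names (ToyDrive, demo_cache), not how any real firmware or the block layer works; it assumes a write-back drive with no battery backup acknowledges a write once the data is in its cache.

```python
# Toy model of a drive cache. All names are hypothetical.

class ToyDrive:
    def __init__(self, write_back=True):
        self.write_back = write_back
        self.cache = {}  # volatile: contents vanish on power loss
        self.media = {}  # persistent: survives power loss

    def write(self, block, data):
        if self.write_back:
            self.cache[block] = data  # acknowledged before reaching the media
        else:
            self.media[block] = data  # write-through: on the media when acknowledged
        return True                   # completion reported to the caller

    def flush(self):
        # Push everything cached onto the media.
        self.media.update(self.cache)
        self.cache.clear()

    def power_loss(self):
        # Assumes no battery backup, so the cache is simply gone.
        self.cache.clear()

    def read(self, block):
        if block in self.cache:
            return self.cache[block]
        return self.media.get(block)


def demo_cache():
    # Write-back, power lost before any flush: A and B are dropped.
    d1 = ToyDrive(write_back=True)
    d1.write("A", "a")
    d1.write("B", "b")
    d1.power_loss()
    lost = (d1.read("A"), d1.read("B"))

    # Write-back, but flushed before the power loss: A and B survive.
    d2 = ToyDrive(write_back=True)
    d2.write("A", "a")
    d2.write("B", "b")
    d2.flush()
    d2.power_loss()
    flushed = (d2.read("A"), d2.read("B"))

    # Write-through: A and B survive without an explicit flush.
    d3 = ToyDrive(write_back=False)
    d3.write("A", "a")
    d3.write("B", "b")
    d3.power_loss()
    through = (d3.read("A"), d3.read("B"))

    return lost, flushed, through
```

Note that the bitmap appears nowhere in this model. Whether A and B survive depends only on the drive's cache mode and on whether power was lost before the cached data reached the media.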
Mike